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ES AND THEIR USES 

This invention was made with support from the Howard Hughes Medical Institute. The 
Government may have certain rights in this invention. 






INTRODUCTION

Technical Field 

The field of this invention is segment polarity genes and their uses. 
Background 

Segment polarity genes were originally discovered as mutations in flies that change the
pattern of body segment structures. Mutations in these genes cause animals to develop changed
patterns on the surfaces of body segments, the changes affecting the pattern along the head to
tail axis. Among the genes in this class are hedgehog, which encodes a secreted protein (HH),
and patched, which encodes a protein structurally similar to transporter proteins, having twelve
transmembrane domains (ptc), with two conserved glycosylation signals.
The hedgehog gene of flies has at least three vertebrate relatives: Sonic hedgehog (Shh),
Indian hedgehog (Ihh), and Desert hedgehog (Dhh). Shh is expressed in a group of cells, at
the posterior of each developing limb bud, that have an important role in signaling polarity to
the developing limb. The Shh protein product, SHH, is a critical trigger of posterior limb
development, and is also involved in polarizing the neural tube and somites along the dorsal
ventral axis. Based on genetic experiments in flies, patched and hedgehog have antagonistic
effects in development. The patched gene product, ptc, is widely expressed in fetal and adult
tissues, and plays an important role in regulation of development. Ptc downregulates

Heemskerk and DiNardo (1994) Cell 76:449-460; and Roelink et al. (1994) Cell 76:761-775.
Mapping of deleted regions on chromosome 9 in skin cancers is described in Habuchi
et al. (1995) Oncogene 11:1671-1674; Quinn et al. (1994) Genes Chromosomes Cancer
11:222-225; Quinn et al. (1994) J. Invest. Dermatol. 102:300-303; and Wicking et al. (1994)
Genomics 22:505-511.

Gorlin (1987) Medicine 66:98-113 reviews nevoid basal cell carcinoma syndrome. The
syndrome shows autosomal dominant inheritance with probably complete penetrance. About
60% of the cases represent new mutations. Developmental abnormalities found with this
syndrome include rib and craniofacial abnormalities, polydactyly, syndactyly and spina bifida.
Tumors found with the syndrome include basal cell carcinomas, fibromas of the ovaries and
heart, cysts of the skin, jaws and mesentery, meningiomas and medulloblastomas.

SUMMARY OF THE INVENTION 
Isolated nucleotide compositions and sequences are provided for patched (ptc) genes,
including mammalian, e.g. human and mouse, and invertebrate homologs. Decreased
expression of ptc is associated with the occurrence of human cancers, particularly basal
cell carcinomas and other tumors of epithelial tissues such as the skin. The cancers may be
familial, having as a component of risk a germline mutation in the gene, or may be sporadic.
Ptc, and its antagonist hedgehog, are useful in creating transgenic animal models for these
human cancers. The ptc nucleic acid compositions find use in identifying homologous or
related genes; in producing compositions that modulate the expression or function of its encoded
protein, ptc; for gene therapy; in mapping functional regions of the protein; and in studying
associated physiological pathways. In addition, modulation of the gene activity in vivo is used
for prophylactic and therapeutic purposes, such as treatment of cancer, identification of cell type
based on expression, and the like. Ptc, anti-ptc antibodies and ptc nucleic acid sequences are
useful as diagnostics for a genetic predisposition to cancer or developmental abnormality
syndromes, and to identify specific cancers having mutations in this gene.

BRIEF DESCRIPTION OF THE DRAWINGS

Fig. 1 is a graph having a restriction map of about 10 kbp of the 5' region upstream from
the initiation codon of the Drosophila patched gene and bar graphs of constructs of truncated
portions of the 5' region joined to β-galactosidase, where the constructs are introduced into fly
cell lines for the production of embryos. The expression of β-gal in the embryos is indicated
in the right-hand table during early and late development of the embryo. The greater the
number of +'s, the more intense the staining.

Fig. 2 shows a summary of mutations found in the human patched gene locus that are
associated with basal cell nevus syndrome. Mutation (1) is found in sporadic basal cell
carcinoma, and is a C to T transition in exon 3 at nucleotide 523 of the coding sequence,
changing Leu 175 to Phe in the first extracellular loop. Mutations 2-4 are found in hereditary
basal cell nevus syndrome. (2) is an insertion of 9 bp at nucleotide 2445, resulting in the
insertion of an additional 3 amino acids after amino acid 815. (3) is a deletion of 11 bp, which
removes nt 2442-2452 from the coding sequence. The resulting frameshift truncates the open
reading frame after amino acid 813, just after the seventh transmembrane domain. (4) is a G to
C alteration that changes two conserved nucleotides of the 3' splice site adjacent to exon 10,
creating a non-functional splice site that truncates the protein after amino acid 449, in the second
transmembrane region.

DATABASE REFERENCES FOR NUCLEOTIDE AND AMINO ACID SEQUENCES
The sequence for the D. melanogaster patched gene has the Genbank accession
number M28418. The sequence for the mouse patched gene has the Genbank accession
number U30589-V46155. The sequence for the human patched gene has the Genbank
accession number U59464.


DESCRIPTION OF THE SPECIFIC EMBODIMENTS 
Mammalian and invertebrate patched (ptc) gene compositions and methods for their
isolation are provided. Of particular interest are the human and mouse homologs. Certain
human cancers, e.g. basal cell carcinoma, transitional cell carcinoma of the bladder,
meningiomas, medulloblastomas, etc., show decreased ptc activity, resulting from oncogenic
mutations at the ptc locus. Many such cancers are sporadic, where the tumor cells have a
somatic mutation in ptc. The basal cell nevus syndrome (BCNS), an inherited disorder, is
associated with germline mutations in ptc. Such germline mutations may also be associated
with other human cancers, including carcinomas, adenocarcinomas, sarcomas and the like.
Decreased ptc activity is also associated with inherited developmental abnormalities, e.g. rib and
craniofacial abnormalities, polydactyly, syndactyly and spina bifida.

The ptc genes and fragments thereof, encoded protein, and anti-ptc antibodies are useful
in the identification of individuals predisposed to development of such cancers and
developmental abnormalities, and in characterizing the phenotype of sporadic tumors that are
associated with this gene, e.g., for diagnostic and/or prognostic benefit. The characterization
is useful for prenatal screening, and in determining further treatment of the patient. Tumors
may be typed or staged as to the ptc status, e.g. by detection of mutated sequences, antibody
detection of abnormal protein products, and functional assays for altered ptc activity. The
encoded ptc protein is useful in drug screening for compositions that mimic ptc activity or
expression, including altered forms of ptc protein, particularly with respect to ptc function as
a tumor suppressor in oncogenesis.

The human and mouse ptc gene sequences and isolated nucleic acid compositions are
provided. In identifying the mouse and human patched genes, cross-hybridization of DNA and
amplification primers were employed to move through the evolutionary tree from the known
Drosophila ptc sequence, identifying a number of invertebrate homologs. The human patched
gene has been mapped to human chromosome band 9q22.3, and lies between the polymorphic
markers D9S196 and D9S287 (a detailed map of human genome markers may be found in Dib
et al. (1996) Nature 380:152-154; http://www.genethon.fr).

DNA from a patient having a tumor or developmental abnormality, which may be
associated with ptc, is analyzed for the presence of a predisposing mutation in the ptc gene.
The presence of a mutated ptc sequence that affects the activity or expression of the gene
product, ptc, confers an increased susceptibility to one or more of these conditions. Individuals
are screened by analyzing their DNA for the presence of a predisposing oncogenic or
developmental mutation, as compared to a normal sequence. A "normal" sequence of patched
is provided in SEQ ID NO:18 (human). Specific mutations of interest include any mutation
that leads to oncogenesis or developmental abnormalities, including insertions, substitutions and
deletions in the coding region sequence, in introns where they affect splicing, and in promoter or
enhancer sequences where they affect the activity and expression of the protein.

Screening for tumors or developmental abnormalities may also be based on the
functional or antigenic characteristics of the protein. Immunoassays designed to detect the
normal or abnormal ptc protein may be used in screening. Where many diverse mutations lead
to a particular disease phenotype, functional protein assays have proven to be effective screening
tools. Such assays may be based on detecting changes in the transcriptional regulation
mediated by ptc, or may directly detect ptc transporter activity, or may involve antibody
localization of patched in cells.

Inheritance of BCNS is autosomal dominant, although many cases are the result of new
mutations. Diagnosis of BCNS is performed by protein, DNA sequence or hybridization
analysis of any convenient sample from a patient, e.g. biopsy material, blood sample, scrapings
from cheek, etc. A typical patient genotype will have a predisposing mutation on one
chromosome. In tumors and at least sometimes developmentally affected tissues, loss of
heterozygosity at the ptc locus leads to aberrant cell and tissue behavior. When the normal
copy of ptc is lost, leaving only the reduced function mutant copy, abnormal cell growth and
reduced cell layer adhesion is the result. Examples of specific ptc mutations in BCNS patients
are a 9 bp insertion at nt 2445 of the coding sequence, and an 11 bp deletion of nt 2442 to 2452
of the coding sequence. These result in insertions or deletions in the region of the seventh
transmembrane domain.
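
The consequences of such mutations for the reading frame follow from simple codon arithmetic. The sketch below is illustrative only and is not part of the described methods. The function names are hypothetical, and positions are assumed to be numbered from 1 at the first nucleotide of the coding sequence. The numeric inputs (nt 523, the 9 bp insertion at nt 2445, the 11 bp deletion of nt 2442-2452) are the values given in this document.

```python
def codon_number(nt_position: int) -> int:
    """Return the 1-based codon that contains a 1-based coding-sequence position."""
    if nt_position < 1:
        raise ValueError("coding-sequence positions are 1-based")
    return (nt_position - 1) // 3 + 1


def preserves_frame(indel_length: int) -> bool:
    """An insertion or deletion keeps the reading frame only if its length is a multiple of 3."""
    return indel_length % 3 == 0


if __name__ == "__main__":
    # C to T transition at nt 523 falls in codon 175.
    print(codon_number(523))
    # A 9 bp insertion adds 3 whole codons and keeps the frame.
    print(preserves_frame(9), 9 // 3)
    # An 11 bp deletion shifts the frame.
    print(preserves_frame(11))
    # The deletion begins at nt 2442, in codon 814, so codon 813 is the
    # last codon left unchanged upstream of the frameshift.
    print(codon_number(2442) - 1)
```

Under these assumptions the arithmetic agrees with the amino acid positions quoted for the mutations: codon 175 for nt 523, three inserted residues for the 9 bp insertion, and an unchanged reading frame only up to amino acid 813 for the 11 bp deletion.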

Prenatal diagnosis of BCNS may be performed, particularly where there is a family
history of the disease, e.g. an affected parent. It is desirable, although not required,
in such cases to determine the specific predisposing mutation present in affected family
members. A sample of fetal DNA such as an amniocentesis sample, fetal nucleated or white
blood cells isolated from maternal blood, chorionic villus sample, etc. is analyzed for the
presence of the predisposing mutation. Alternatively, a protein based assay, e.g. functional
assay or immunoassay, is performed on fetal cells known to express ptc.

Sporadic tumors associated with loss of ptc function include a number of carcinomas and
other transformed cells known to have deletions in the region of chromosome 9q22, e.g. basal
cell carcinomas, transitional cell carcinomas, meningiomas, medulloblastomas, fibromas of the
heart and ovary, and carcinomas of the lung, ovary, kidney and esophagus. Characterization
of sporadic tumors will generally require analysis of tumor cell DNA, conveniently with a biopsy
sample. A wide range of mutations are found in sporadic cases, up to and including deletion
of the entire long arm of chromosome 9. Oncogenic mutations may delete one or more exons,
e.g. 8 and 9, may affect the amino acid sequence such as of the extracellular loops or
transmembrane domains, may cause truncation of the protein by introducing a frameshift or stop
codon, etc. Specific examples of oncogenic mutations include a C to T transition at nt 523,
and deletions encompassing exon 9. C to T transitions are characteristic of ultraviolet
mutagenesis, as expected with cases of skin cancer.

Biochemical studies may be performed to determine whether a candidate sequence
variation in the ptc coding region or control regions is oncogenic. For example, a change in the
promoter or enhancer sequence that downregulates expression of patched may result in
predisposition to cancer. Expression levels of a candidate variant allele are compared to
expression levels of the normal allele by various methods known in the art. Methods for
determining promoter or enhancer strength include quantitation of the expressed natural protein;
insertion of the variant control element into a vector with a reporter gene such as
β-galactosidase, chloramphenicol acetyltransferase, etc. that provides for convenient quantitation;
and the like. The activity of the encoded ptc protein may be determined by comparison with
the wild-type protein, e.g. by detection of transcriptional down-regulation of TGFβ, Wnt family
genes, ptc itself, or reporter gene fusions involving these target genes.
The human patched gene (SEQ ID NO:18) has a 4.5 kb open reading frame encoding
a protein of 1447 amino acids. Including coding and noncoding sequences, it is about 89%
identical at the nucleotide level to the mouse patched gene (SEQ ID NO:09). The mouse
patched gene (SEQ ID NO:09) encodes a protein (SEQ ID NO:10) that has about 38% identical
amino acids to Drosophila ptc (SEQ ID NO:6), over about 1,200 amino acids. The butterfly
homolog (SEQ ID NO:4) is 1,300 amino acids long and overall has a 50% amino acid identity
to fly ptc (SEQ ID NO:6). A 267 bp exon from the beetle patched gene encodes an 89 amino
acid protein fragment, which was found to be 44% and 51% identical to the corresponding
regions of fly and butterfly ptc respectively.
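
Percent identity figures of this kind are computed over an alignment. The following is a minimal sketch of one common convention (matches divided by aligned, ungapped columns); the document does not state which convention or alignment program was used, so the function name, the gap handling and the toy sequences are assumptions made purely for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned, ungapped columns in which the two residues match.

    Both inputs must already be aligned to the same length, with '-' marking gaps.
    Columns containing a gap in either sequence are skipped (an assumed convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from any ptc sequence.
    toy_a = "MKT-LLVA"
    toy_b = "MRTALLIA"
    print(round(percent_identity(toy_a, toy_b), 1))  # 5 matches over 7 columns
    # A 267 bp exon in frame encodes 267 / 3 = 89 amino acids.
    print(267 // 3)
```

Other conventions divide by the length of the shorter sequence or by the full alignment length including gaps, which gives somewhat different percentages for the same alignment.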
The DNA sequence encoding ptc may be cDNA or genomic DNA or a fragment thereof.
The term "patched gene" shall be intended to mean the open reading frame encoding specific
ptc polypeptides, as well as adjacent 5' and 3' non-coding nucleotide sequences involved in the
regulation of expression, up to about 1 kb beyond the coding region, in either direction. The
gene may be introduced into an appropriate vector for extrachromosomal maintenance or for
integration into the host.

The term "cDNA" as used herein is intended to include all nucleic acids that share the
arrangement of sequence elements found in native mature mRNA species, where sequence
elements are exons, 3' and 5' non-coding regions. Normally mRNA species have contiguous
exons, with the intervening introns deleted, to create a continuous open reading frame encoding
ptc.

The genomic ptc sequence has non-contiguous open reading frames, where introns
interrupt the coding regions. A genomic sequence of interest comprises the nucleic acid present
between the initiation codon and the stop codon, as defined in the listed sequences, including
all of the introns that are normally present in a native chromosome. It may further include the
3' and 5' untranslated regions found in the mature mRNA. It may further include specific
transcriptional and translational regulatory sequences, such as promoters, enhancers, etc.,
including about 1 kb of flanking genomic DNA at either the 5' or 3' end of the coding region.
The genomic DNA may be isolated as a fragment of 50 kbp or smaller, and substantially free
of flanking chromosomal sequence.

The nucleic acid compositions of the subject invention encode all or a part of the subject
polypeptides. Fragments of the DNA sequence may be obtained by chemically synthesizing
oligonucleotides in accordance with conventional methods, by restriction enzyme digestion, by
PCR amplification, etc. For the most part, DNA fragments will be of at least 15 nt, usually at
least 18 nt, more usually at least about 50 nt. Such small DNA fragments are useful as primers
for PCR, hybridization screening, etc. Larger DNA fragments, i.e. greater than 100 nt, are
useful for production of the encoded polypeptide. For use in amplification reactions, such as
PCR, a pair of primers will be used. The exact composition of the primer sequences is not
critical to the invention, but for most applications the primers will hybridize to the subject
sequence under stringent conditions, as known in the art. It is preferable to choose a pair of
primers that will generate an amplification product of at least about 50 nt, preferably at least
about 100 nt. Algorithms for the selection of primer sequences are generally known, and are
available in commercial software packages. Amplification primers hybridize to complementary
strands of DNA, and will prime towards each other.
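
The geometry of a primer pair can be checked with a few lines of code. The sketch below is an illustration under stated assumptions, not a primer design algorithm: it requires exact matches, ignores hybridization stringency and melting temperature, and uses a made-up template and made-up 18 nt primers. The helper names `reverse_complement` and `amplicon_length` are hypothetical.

```python
from typing import Optional

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def amplicon_length(template: str, forward: str, reverse: str) -> Optional[int]:
    """Length of the product bounded by a primer pair, or None if the pair does not fit.

    The forward primer must match the given (top) strand exactly; the reverse primer
    must match the complementary strand downstream, so that the two primers prime
    towards each other.
    """
    template = template.upper()
    start = template.find(forward.upper())
    if start == -1:
        return None
    reverse_site = reverse_complement(reverse)  # reverse primer as seen on the top strand
    end = template.find(reverse_site, start + len(forward))
    if end == -1:
        return None
    return end + len(reverse_site) - start


if __name__ == "__main__":
    # Hypothetical sequences for illustration only.
    forward = "ATGCGTACGTTAGCCTGA"
    downstream_site = "CCGGTTAACGATCGGATC"
    reverse = reverse_complement(downstream_site)
    template = "GG" + forward + "T" * 30 + downstream_site + "AA"
    print(amplicon_length(template, forward, reverse))  # 18 + 30 + 18 = 66
```

In this toy case the product is 66 nt, which would satisfy the preference stated above for a product of at least about 50 nt.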
The ptc genes are isolated and obtained in substantial purity, generally as other than an
intact mammalian chromosome. Usually, the DNA will be obtained substantially free of other
nucleic acid sequences that do not include a ptc sequence or fragment thereof, generally being
at least about 50%, usually at least about 90% pure and are typically "recombinant", i.e. flanked
by one or more nucleotides with which it is not normally associated on a naturally occurring
chromosome.

The DNA sequences are used in a variety of ways. They may be used as probes for
identifying other patched genes. Mammalian homologs have substantial sequence similarity to
the subject sequences, i.e. at least 75%, usually at least 90%, more usually at least 95%

cleaving agent, e.g. a chelated metal ion such as iron or chromium, for cleavage of the gene; as
an antisense sequence; or the like. Modifications may include replacing oxygen of the
phosphate ester with sulfur or nitrogen, replacing the phosphate with phosphoramide, etc.

A number of methods are available for analyzing genomic DNA sequences. Where large
amounts of DNA are available, the genomic DNA is used directly. Alternatively, the region of

by conventional techniques, such as the polymerase chain reaction (PCR). The use of the
polymerase chain reaction is described in Saiki et al. (1985) Science 239:487, and a review
of current techniques may be found in Sambrook et al. Molecular Cloning: A Laboratory
Manual, CSH Press 1989, pp. 14.2-14.33.

A detectable label may be included in the amplification reaction. Suitable labels include
fluorochromes, e.g. fluorescein isothiocyanate (FITC), rhodamine, Texas Red, phycoerythrin,
allophycocyanin, 6-carboxyfluorescein (6-FAM), 2',7'-dimethoxy-4',5'-dichloro-6-carboxyfluorescein
(JOE), 6-carboxy-X-rhodamine (ROX), 6-carboxy-2',4',7',4,7-hexachlorofluorescein (HEX),
5-carboxyfluorescein (5-FAM) or N,N,N',N'-tetramethyl-6-carboxyrhodamine (TAMRA);
radioactive labels, e.g. 32P, 35S, 3H; etc. The label may be a two
stage system, where the amplified DNA is conjugated to biotin, haptens, etc. having a high
affinity binding partner, e.g. avidin, specific antibodies, etc., where the binding partner is
conjugated to a detectable label. The label may be conjugated to one or both of the primers.
Alternatively, the pool of nucleotides used in the amplification is labeled, so as to incorporate
the label into the amplification product.

The amplified or cloned fragment may be sequenced by dideoxy or other methods, and
the sequence of bases compared to the normal ptc sequence. Hybridization with the variant
sequence may also be used to determine its presence, by Southern blots, dot blots, etc. Single
strand conformational polymorphism (SSCP) analysis, denaturing gradient gel electrophoresis
(DGGE), and heteroduplex analysis in gel matrices are used to detect conformational changes
created by DNA sequence variation as alterations in electrophoretic mobility. The hybridization
pattern of a control and variant sequence to an array of oligonucleotide probes immobilized on
a solid support, as described in WO 95/11995, may also be used as a means of detecting the
presence of variant sequences. Alternatively, where a predisposing mutation creates or destroys
a recognition site for a restriction endonuclease, the fragment is digested with that endonuclease,
and the products size fractionated to determine whether the fragment was digested.
Fractionation is performed by gel electrophoresis, particularly acrylamide or agarose gels.
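
The restriction digest comparison can be previewed computationally before running a gel. The sketch below is a hedged illustration: the two fragment sequences are invented, the choice of EcoRI (recognition site GAATTC, cut after the first G on the top strand) is only an example enzyme, and the function names are hypothetical. It is not specific to any ptc mutation.

```python
from typing import List


def site_positions(seq: str, site: str) -> List[int]:
    """0-based start positions of every occurrence of a recognition site on the top strand."""
    seq, site = seq.upper(), site.upper()
    return [i for i in range(len(seq) - len(site) + 1) if seq[i:i + len(site)] == site]


def digest_fragment_lengths(seq: str, site: str, cut_offset: int) -> List[int]:
    """Top-strand fragment lengths after cutting cut_offset bases into each site."""
    cuts = [p + cut_offset for p in site_positions(seq, site)]
    bounds = [0] + cuts + [len(seq)]
    return [right - left for left, right in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    ecori_site, ecori_offset = "GAATTC", 1
    # Hypothetical amplified fragments: the variant carries a C to T change
    # in the last base of the site, which destroys it.
    normal = "ACGT" * 5 + "GAATTC" + "TGCA" * 5
    variant = "ACGT" * 5 + "GAATTT" + "TGCA" * 5
    print(digest_fragment_lengths(normal, ecori_site, ecori_offset))   # two fragments
    print(digest_fragment_lengths(variant, ecori_site, ecori_offset))  # uncut
```

A difference in the predicted fragment pattern between the normal and variant sequence indicates that size fractionation of the digest would distinguish the two alleles.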

The subject nucleic acids can be used to generate transgenic animals or site specific gene
modifications in cell lines. Transgenic animals may be made through homologous recombination,
where the normal patched locus is altered. Alternatively, a nucleic acid construct is randomly
integrated into the genome. Vectors for stable integration include plasmids, retroviruses and
other animal viruses, YACs, and the like.

The modified cells or animals are useful in the study of patched function and regulation.
For example, a series of small deletions and/or substitutions may be made in the patched gene
to determine the role of different exons in oncogenesis, signal transduction, etc. Of particular
interest are transgenic animal models for carcinomas of the skin, where expression of ptc is
specifically reduced or absent in skin cells. An alternative approach to transgenic models for this
disease is one where one of the mammalian hedgehog genes, e.g. Shh, Ihh, Dhh, is
upregulated in skin cells, or in other cell types. For models of skin abnormalities, one may use
a skin-specific promoter to drive expression of the transgene, or other inducible promoter that
can be regulated in the animal model. Such promoters include keratin gene promoters. Specific
constructs of interest include anti-sense ptc, which will block ptc expression, expression of
dominant negative ptc mutations, and over-expression of HH genes. A detectable marker, such
as lacZ, may be introduced into the patched locus, where upregulation of patched expression will
result in an easily detected change in phenotype.

One may also provide for expression of the patched gene or variants thereof in cells or
tissues where it is not normally expressed or at abnormal times of development. Thus, mouse
models of spina bifida or abnormal motor neuron differentiation in the developing spinal cord
are made available. In addition, by providing expression of ptc protein in cells in which it is
otherwise not normally produced, one can induce changes in cell behavior, e.g. through
ptc-mediated transcription modulation.

DNA constructs for homologous recombination will comprise at least a portion of the
patched or hedgehog gene with the desired genetic modification, and will include regions of
homology to the target locus. DNA constructs for random integration need not include regions
of homology to mediate recombination. Conveniently, markers for positive and negative
selection are included. Methods for generating cells having targeted gene modifications through
homologous recombination are known in the art. For various techniques for transfecting
mammalian cells, see Keown et al. (1990) Methods in Enzymology 185:527-537.

For embryonic stem (ES) cells, an ES cell line may be employed, or ES cells may be obtained
freshly from a host, e.g. mouse, rat, guinea pig, etc. Such cells are grown on an appropriate
fibroblast-feeder layer or grown in the presence of leukemia inhibiting factor (LIF). When ES
cells have been transformed, they may be used to produce transgenic animals. After
transformation, the cells are plated onto a feeder layer in an appropriate medium. Cells
containing the construct may be detected by employing a selective medium. After sufficient time
for colonies to grow, they are picked and analyzed for the occurrence of homologous
recombination or integration of the construct. Those colonies that are positive may then be used
for embryo manipulation and blastocyst injection. Blastocysts are obtained from 4 to 6 week old
superovulated females. The ES cells are trypsinized, and the modified cells are injected into the
blastocoel of the blastocyst. After injection, the blastocysts are returned to each uterine horn of
pseudopregnant females. Females are then allowed to go to term and the resulting litters
screened for mutant cells having the construct. By providing for a different phenotype of the
blastocyst and the ES cells, chimeric progeny can be readily detected.

The chimeric animals are screened for the presence of the modified gene and males and
females having the modification are mated to produce homozygous progeny. If the gene
alterations cause lethality at some point in development, tissues or organs can be maintained as
allogeneic or congenic grafts or transplants, or in in vitro culture. The transgenic animals may
be any non-human mammal, such as laboratory animals, domestic animals, etc. The transgenic
animals may be used in functional studies, drug screening, etc., e.g. to determine the effect of
a candidate drug on basal cell carcinomas.

The subject gene may be employed for producing all or portions of the patched protein.
For expression, an expression cassette may be employed, providing for a transcriptional and
translational initiation region, which may be inducible or constitutive, the coding region under
the transcriptional control of the transcriptional initiation region, and a transcriptional and
translational termination region. Various transcriptional initiation regions may be employed
which are functional in the expression host.

Specific ptc peptides of interest include the extracellular domains, particularly in the
human mature protein, aa 120 to 437, and aa 770 to 1027. These peptides may be used as
immunogens to raise antibodies that recognize the protein in an intact cell membrane. The
cytoplasmic domains, as shown in Figure 2 (the amino terminus and carboxy terminus), are of
interest in binding assays to detect ligands involved in signaling mediated by ptc.






WO 97/45541 PCT/US97/09553 


The peptide may be expressed in prokaryotes or eukaryotes in accordance with
conventional ways, depending upon the purpose for expression. For large scale production of
the protein, a unicellular organism or cells of a higher organism, e.g. eukaryotes such as
vertebrates, particularly mammals, may be used as the expression host, such as E. coli,
B. subtilis, S. cerevisiae, and the like. In many situations, it may be desirable to express the
patched gene in a mammalian host, whereby the patched gene product will be glycosylated, and
transported to the cellular membrane for various studies.

With the availability of the protein in large amounts by employing an expression host,
the protein may be isolated and purified in accordance with conventional ways. A lysate may be
prepared of the expression host and the lysate purified using HPLC, exclusion chromatography,
gel electrophoresis, affinity chromatography, or other purification technique. The purified
protein will generally be at least about 80% pure, preferably at least about 90% pure, and may
be up to and including 100% pure. By pure is intended free of other proteins, as well as cellular
debris.

The polypeptide is used for the production of antibodies, where short fragments provide
for antibodies specific for the particular polypeptide, whereas larger fragments or the entire gene
allow for the production of antibodies over the surface of the polypeptide or protein. Antibodies
may be raised to the normal or mutated forms of ptc. The extracellular domains of the protein
are of interest as epitopes, particularly antibodies that recognize common changes found in
abnormal, oncogenic ptc, which compromise the protein activity. Antibodies may be raised to
isolated peptides corresponding to these domains, or to the native protein, e.g. by immunization
with cells expressing ptc, immunization with liposomes having ptc inserted in the membrane, etc.
Antibodies that recognize the extracellular domains of ptc are useful in diagnosis, typing and 
staging of human carcinomas.

Antibodies are prepared in accordance with conventional ways, where the expressed
polypeptide or protein may be used as an immunogen, by itself or conjugated to known
immunogenic carriers, e.g. KLH, pre-S HBsAg, other viral or eukaryotic proteins, or the like.
Various adjuvants may be employed, with a series of injections, as appropriate. For monoclonal
antibodies, after one or more booster injections, the spleen may be isolated, the splenocytes
immortalized, and then screened for high affinity antibody binding. The immortalized cells, e.g.
hybridomas, producing the desired antibodies may then be expanded. For further description,
see Monoclonal Antibodies: A Laboratory Manual, Harlow and Lane eds., Cold Spring Harbor
Laboratories, Cold Spring Harbor, New York, 1988. If desired, the mRNA encoding the heavy
and light chains may be isolated and mutagenized by cloning in E. coli, and the heavy and light
chains may be mixed to further enhance the affinity of the antibody.

The antibodies find particular use in diagnostic assays for developmental abnormalities,
basal cell carcinomas and other tumors associated with mutations in ptc. Staging, detection and
typing of tumors may utilize a quantitative immunoassay for the presence or absence of normal
ptc. Alternatively, the presence of mutated forms of ptc may be determined. A reduction in
normal ptc and/or presence of abnormal ptc is indicative that the tumor is ptc-associated.

A sample is taken from a patient suspected of having a ptc-associated tumor,
developmental abnormality or BCNS. Samples, as used herein, include biological fluids such as
blood, cerebrospinal fluid, tears, saliva, lymph, dialysis fluid and the like; organ or tissue culture
derived fluids; and fluids extracted from physiological tissues. Also included in the term are
derivatives and fractions of such fluids. Biopsy samples are of particular interest, e.g. skin
lesions, organ tissue fragments, etc. Where metastasis is suspected, blood samples may be
preferred. The number of cells in a sample will generally be at least about 10^3, usually at least
10^4, more usually at least about 10^5. The cells may be dissociated, in the case of solid tissues,
or tissue sections may be analyzed. Alternatively a lysate of the cells may be prepared.

Diagnosis may be performed by a number of methods. The different methods all
determine the absence or presence of normal or abnormal ptc in patient cells suspected of having
a mutation in ptc. For example, detection may utilize staining of intact cells or histological
sections, performed in accordance with conventional methods. The antibodies of interest are
added to the cell sample, and incubated for a period of time sufficient to allow binding to the
epitope, usually at least about 10 minutes. The antibody may be labeled with radioisotopes,
enzymes, fluorescers, chemiluminescers, or other labels for direct detection. Alternatively, a
second stage antibody or reagent is used to amplify the signal. Such reagents are well-known
in the art. For example, the primary antibody may be conjugated to biotin, with horseradish
peroxidase-conjugated avidin added as a second stage reagent. Final detection uses a substrate
that undergoes a color change in the presence of the peroxidase. The absence or presence of
antibody binding may be determined by various methods, including flow cytometry of
dissociated cells, microscopy, radiography, scintillation counting, etc.

An alternative method for diagnosis depends on the in vitro detection of binding between
antibodies and ptc in a lysate. Measuring the concentration of ptc binding in a sample or fraction
thereof may be accomplished by a variety of specific assays. A conventional sandwich type assay
may be used. For example, a sandwich assay may first attach ptc-specific antibodies to an
insoluble surface or support. The particular manner of binding is not crucial so long as it is
compatible with the reagents and overall methods of the invention. They may be bound to the
plates covalently or non-covalently, preferably non-covalently.

The insoluble supports may be any compositions to which polypeptides can be bound,
which are readily separated from soluble material, and which are otherwise compatible with the
overall method. The surface of such supports may be solid or porous and of any convenient
shape. Examples of suitable insoluble supports to which the receptor is bound include beads, e.g.
magnetic beads, membranes and microtiter plates. These are typically made of glass, plastic (e.g.
polystyrene), polysaccharides, nylon or nitrocellulose. Microtiter plates are especially convenient
because a large number of assays can be carried out simultaneously, using small amounts of
reagents and samples.

Patient sample lysates are then added to separately assayable supports (for example,
separate wells of a microtiter plate) containing antibodies. Preferably, a series of standards,
containing known concentrations of normal and/or abnormal ptc is assayed in parallel with the
samples or aliquots thereof to serve as controls. Preferably, each sample and standard will be
added to multiple wells so that mean values can be obtained for each. The incubation time
should be sufficient for binding; generally, from about 0.1 to 3 hr is sufficient. After incubation,
the insoluble support is generally washed of non-bound components. Generally, a dilute non-ionic
detergent medium at an appropriate pH, generally 7-8, is used as a wash medium. From
one to six washes may be employed, with sufficient volume to thoroughly wash nonspecifically
bound proteins present in the sample.
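
The use of standards and replicate wells described above can be sketched in code. The following is only a minimal illustration under stated assumptions, not the method of the invention: it assumes the signal rises monotonically with concentration and interpolates linearly between neighboring standards. The function names and the example readings are hypothetical.

```python
from statistics import mean


def build_standard_curve(standards):
    """Average the replicate wells of each standard.

    `standards` maps a known ptc concentration to a list of replicate signals.
    Returns (mean_signal, concentration) pairs sorted by mean signal.
    """
    return sorted((mean(wells), conc) for conc, wells in standards.items())


def estimate_concentration(sample_wells, curve):
    """Interpolate the mean sample signal linearly between adjacent standards."""
    signal = mean(sample_wells)
    for (s0, c0), (s1, c1) in zip(curve, curve[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:
                return c0
            return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
    raise ValueError("sample signal lies outside the range of the standards")


# Hypothetical readings (arbitrary units): two wells per standard and per sample.
standards = {0.0: [0.0, 0.0], 10.0: [1.0, 1.0], 20.0: [2.0, 2.0]}
curve = build_standard_curve(standards)
estimate = estimate_concentration([1.25, 1.75], curve)
```

A sample whose mean signal falls outside the range covered by the standards is rejected rather than extrapolated, which is the conservative choice when the shape of the curve beyond the standards is unknown.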

After washing, a solution containing a second antibody is applied. The antibody will bind
ptc with sufficient specificity such that it can be distinguished from other components present.
The second antibodies may be labeled to facilitate direct, or indirect quantification of binding.
Examples of labels that permit direct measurement of second receptor binding include
radiolabels, such as 3H or 125I, fluorescers, dyes, beads, chemiluminescers, colloidal particles,
and the like. Examples of labels which permit indirect measurement of binding include enzymes
where the substrate may provide for a colored or fluorescent product. In a preferred
embodiment, the antibodies are labeled with a covalently bound enzyme capable of providing
a detectable product signal after addition of suitable substrate. Examples of suitable enzymes
for use in conjugates include horseradish peroxidase, alkaline phosphatase, malate
dehydrogenase and the like. Where not commercially available, such antibody-enzyme
conjugates are readily produced by techniques known to those skilled in the art. The incubation
time should be sufficient for the labeled ligand to bind available molecules. Generally, from
about 0.1 to 3 hr is sufficient, usually 1 hr sufficing.

After the second binding step, the insoluble support is again washed free of non-specifically
bound material. The signal produced by the bound conjugate is detected by
conventional means. Where an enzyme conjugate is used, an appropriate enzyme substrate is
provided so a detectable product is formed.

Other immunoassays are known in the art and may find use as diagnostics. Ouchterlony
plates provide a simple determination of antibody binding. Western blots may be performed on
protein gels or protein spots on filters, using a detection system specific for ptc as desired,
conveniently using a labeling method as described for the sandwich assay.

Other diagnostic assays of interest are based on the functional properties of ptc protein
itself. Such assays are particularly useful where a large number of different sequence changes
lead to a common phenotype, i.e., loss of protein function leading to oncogenesis or
developmental abnormality. For example, a functional assay may be based on the transcriptional
changes mediated by hedgehog and patched gene products. Addition of soluble Hh to
embryonic stem cells causes induction of transcription in target genes. The presence of
functional ptc can be determined by its ability to antagonize Hh activity. Other functional assays
may detect the transport of specific molecules mediated by ptc, in an intact cell or membrane
fragment. Conveniently, a labeled substrate is used, where the transport in or out of the cell can
be quantitated by radiography, microscopy, flow cytometry, spectrophotometry, etc. Other
assays may detect conformational changes, or changes in the subcellular localization of patched
protein.

By providing for the production of large amounts of patched protein, one can identify
ligands or substrates that bind to, modulate or mimic the action of patched. A common feature
in basal cell carcinoma is the loss of adhesion between epidermal and dermal layers, indicating
a role for ptc in maintaining appropriate cell adhesion. Areas of investigation include the
development of cancer treatments, wound healing, adverse effects of aging, metastasis, etc.

Drug screening identifies agents that provide a replacement for ptc function in abnormal
cells. The role of ptc as a tumor suppressor indicates that agents which mimic its function, in
terms of transmembrane transport of molecules, transcriptional down-regulation, etc., will inhibit
the process of oncogenesis. These agents may also promote appropriate cell adhesion in wound
healing and aging, to reverse the loss of adhesion observed in metastasis, etc. Conversely, agents
that reverse ptc function may stimulate controlled growth and healing. Of particular interest are
screening assays for agents that have a low toxicity for human cells. A wide variety of assays
may be used for this purpose, including labeled in vitro protein-protein binding assays,
electrophoretic mobility shift assays, immunoassays for protein binding, and the like. The
purified protein may also be used for determination of three-dimensional crystal structure, which
can be used for modeling intermolecular interactions, transporter function, etc.

The term "agent" as used herein describes any molecule, e.g. protein or pharmaceutical,
with the capability of altering or mimicking the physiological function of patched. Generally a
plurality of assay mixtures are run in parallel with different agent concentrations to obtain a
differential response to the various concentrations. Typically, one of these concentrations serves
as a negative control, i.e. at zero concentration or below the level of detection.

Candidate agents encompass numerous chemical classes, though typically they are
organic molecules, preferably small organic compounds having a molecular weight of more than
50 and less than about 2,500 daltons. Candidate agents comprise functional groups necessary
for structural interaction with proteins, particularly hydrogen bonding, and typically include at
least an amine, carbonyl, hydroxyl or carboxyl group, preferably at least two of the functional
chemical groups. The candidate agents often comprise cyclical carbon or heterocyclic structures
and/or aromatic or polyaromatic structures substituted with one or more of the above functional
groups. Candidate agents are also found among biomolecules including peptides, saccharides,
fatty acids, steroids, purines, pyrimidines, derivatives, structural analogs or combinations thereof.
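
The size and functional-group criteria above lend themselves to a simple pre-filter over a compound library. The sketch below is only illustrative: the `Candidate` record, the `passes_prefilter` function and the group labels are hypothetical names, and the text's "about 2,500 daltons" is treated as a hard cutoff for simplicity.

```python
from dataclasses import dataclass

# Functional groups named in the text as typical of candidate agents.
FUNCTIONAL_GROUPS = frozenset({"amine", "carbonyl", "hydroxyl", "carboxyl"})


@dataclass(frozen=True)
class Candidate:
    name: str
    mol_weight: float               # molecular weight in daltons
    groups: frozenset = frozenset()  # functional groups present in the molecule


def passes_prefilter(candidate, min_groups=2):
    """Apply the size window (50 to 2,500 daltons) and the functional-group count."""
    in_size_window = 50 < candidate.mol_weight < 2500
    n_groups = len(FUNCTIONAL_GROUPS & candidate.groups)
    return in_size_window and n_groups >= min_groups


# Hypothetical library entries.
library = [
    Candidate("cpd-1", 320.0, frozenset({"amine", "hydroxyl"})),
    Candidate("cpd-2", 3100.0, frozenset({"amine", "carboxyl"})),
    Candidate("cpd-3", 410.0, frozenset({"carbonyl"})),
]
selected = [c.name for c in library if passes_prefilter(c)]
```

Setting `min_groups=1` relaxes the filter from the preferred "at least two" groups to the minimum of one group stated in the text.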

Candidate agents are obtained from a wide variety of sources including libraries of
synthetic or natural compounds. For example, numerous means are available for random and
directed synthesis of a wide variety of organic compounds and biomolecules, including
expression of randomized oligonucleotides and oligopeptides. Alternatively, libraries of natural
compounds in the form of bacterial, fungal, plant and animal extracts are available or readily
produced. Additionally, natural or synthetically produced libraries and compounds are readily
modified through conventional chemical, physical and biochemical means, and may be used to
produce combinatorial libraries. Known pharmacological agents may be subjected to directed
or random chemical modifications, such as acylation, alkylation, esterification, amidification, etc.
to produce structural analogs.

Where the screening assay is a binding assay, one or more of the molecules may be
joined to a label, where the label can directly or indirectly provide a detectable signal. Various
labels include radioisotopes, fluorescers, chemiluminescers, enzymes, specific binding molecules,
particles, e.g. magnetic particles, and the like. Specific binding molecules include pairs, such as
biotin and streptavidin, digoxin and antidigoxin etc. For the specific binding members, the
complementary member would normally be labeled with a molecule that provides for detection,
in accordance with known procedures.

A variety of other reagents may be included in the screening assay. These include
reagents like salts, neutral proteins, e.g. albumin, detergents, etc. that are used to facilitate
optimal protein-protein binding and/or reduce nonspecific or background interactions. Reagents
that improve the efficiency of the assay, such as protease inhibitors, nuclease inhibitors, anti-microbial
agents, etc. may be used. The mixture of components is added in any order that
provides for the requisite binding. Incubations are performed at any suitable temperature,
typically between 4° and 40° C. Incubation periods are selected for optimum activity, but may
also be optimized to facilitate rapid high-throughput screening. Typically between 0.1 and 1
hours will be sufficient.

Other assays of interest detect agents that mimic patched function, such as repression
of target gene transcription, transport of patched substrate compounds, etc. For example, an
expression construct comprising a patched gene may be introduced into a cell line under
conditions that allow expression. The level of patched activity is determined by a functional
assay, as previously described. In one screening assay, candidate agents are added in
combination with a Hh protein, and the ability to overcome Hh antagonism of ptc is detected.
In another assay, the ability of candidate agents to enhance ptc function is determined.
Alternatively, candidate agents are added to a cell that lacks functional ptc, and screened for the
ability to reproduce ptc in a functional assay.

The compounds having the desired pharmacological activity may be administered in a
physiologically acceptable carrier to a host for treatment of cancer or developmental
abnormalities attributable to a defect in patched function. The compounds may also be used to
enhance patched function in wound healing, aging, etc. The inhibitory agents may be
administered in a variety of ways, orally, topically, parenterally e.g. subcutaneously,
intraperitoneally, by viral infection, intravascularly, etc. Topical treatments are of particular
interest. Depending upon the manner of introduction, the compounds may be formulated in a
variety of ways. The concentration of therapeutically active compound in the formulation may
vary from about 0.1-100 wt.%.

The pharmaceutical compositions can be prepared in various forms, such as granules,
tablets, pills, suppositories, capsules, suspensions, salves, lotions and the like. Pharmaceutical
grade organic or inorganic carriers and/or diluents suitable for oral and topical use can be used
to make up compositions containing the therapeutically-active compounds. Diluents known to
the art include aqueous media, vegetable and animal oils and fats. Stabilizing agents, wetting and
emulsifying agents, salts for varying the osmotic pressure or buffers for securing an adequate
pH value, and skin penetration enhancers can be used as auxiliary agents.

The gene or fragments thereof may be used as probes for identifying the 5' non-coding
region comprising the transcriptional initiation region, particularly the enhancer regulating the
transcription of patched. By probing a genomic library, particularly with a probe comprising the
5' coding region, one can obtain fragments comprising the 5' non-coding region. If necessary,
one may walk the fragment to obtain further 5' sequence to ensure that one has at least a
functional portion of the enhancer. It is found that the enhancer is proximal to the 5' coding
region, a portion being in the transcribed sequence and downstream from the promoter
sequences. The transcriptional initiation region may be used for many purposes, studying
embryonic development, providing for regulated expression of patched protein or other protein
of interest during embryonic development or thereafter, and in gene therapy.

The gene may also be used for gene therapy. Vectors useful for introduction of the gene
include plasmids and viral vectors. Of particular interest are retroviral-based vectors, e.g.
Moloney murine leukemia virus and modified human immunodeficiency virus; adenovirus
vectors, etc. Gene therapy may be used to treat skin lesions, an affected fetus, etc., by
transfection of the normal gene into embryonic stem cells or into other fetal cells. A wide variety
of viral vectors can be employed for transfection and stable integration of the gene into the
genome of the cells. Alternatively, micro-injection, fusion, or the like may be employed for
introduction of genes into a suitable host cell. See, for example, Dhawan et al. (1991) Science
254:1509-1512 and Smith et al. (1990) Molecular and Cellular Biology 3268-3271.

The following examples are offered by way of illustration and not by way of limitation.

EXPERIMENTAL 

Methods and Materials

PCR on Mosquito (Anopheles gambiae) Genomic DNA. PCR primers were based on
amino acid stretches of fly ptc that were not likely to diverge over evolutionary time and were
of low degeneracy. Two such primers, P2R1 (SEQ ID NO:14)
GGACGAATTCAARGTOrAVrARYTNTfy and P4R1 (SEQ ID NO:15)
GGACfiA AHttrrfWr AR A AflrANTC (the underlined sequences are Eco RI linkers),
amplified an appropriately sized band from mosquito genomic DNA using the PCR. The
program conditions were as follows:

94° C 4 min.; 72° C add Taq;
[49° C 30 sec.; 72° C 90 sec.; 94° C 15 sec.] 3 times
[94° C 15 sec.; 50° C 30 sec.; 72° C 90 sec.] 35 times
72° C 10 min.; 4° C hold

This band was subcloned into the EcoRV site of pBluescript II and sequenced using the USB
Sequenase kit.
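
As a rough check on how long the cycling program above occupies the thermocycler, the program can be written out as data and its timed steps summed. This is only a sketch: the step values are read from the program as printed, the manual "add Taq" step and the final 4° C hold have no fixed duration and are left out, ramp times between temperatures are ignored, and the names `MOSQUITO_PROGRAM` and `total_seconds` are hypothetical.

```python
# Each entry is (number of repeats, [(temperature in deg C, seconds), ...]).
MOSQUITO_PROGRAM = [
    (1, [(94, 240)]),                      # initial denaturation, 4 min
    (3, [(49, 30), (72, 90), (94, 15)]),   # 3 low-stringency cycles
    (35, [(94, 15), (50, 30), (72, 90)]),  # 35 main cycles
    (1, [(72, 600)]),                      # final extension, 10 min
]


def total_seconds(program):
    """Sum the programmed hold times, ignoring ramping between temperatures."""
    return sum(
        repeats * sum(seconds for _temp, seconds in steps)
        for repeats, steps in program
    )


programmed_minutes = total_seconds(MOSQUITO_PROGRAM) / 60
```

The real run time is longer than this figure because the block needs time to ramp between temperatures on every step.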

Screen of a Butterfly cDNA Library with Mosquito PCR Product. Using the mosquito
PCR product (SEQ ID NO:7) as a probe, a 3 day embryonic Precis coenia λgt10 cDNA library
(generously provided by Sean Carroll) was screened. Filters were hybridized at 65° C overnight
in a solution containing 5x SSC, 10% dextran sulfate, 5x Denhardt's, 200 µg/ml sonicated
salmon sperm DNA, and 0.5% SDS. Filters were washed in 0.1X SSC, 0.1% SDS at room
temperature several times to remove nonspecific hybridization. Of the 100,000 plaques initially
screened, 2 overlapping clones, L1 and L2, were isolated, which corresponded to part
of butterfly ptc. Using L2 as a probe, the library filters were rescreened and 3 additional clones
(L5, L7, L8) were isolated which encompassed the remainder of the ptc coding sequence. The
full length sequence of butterfly ptc (SEQ ID NO:3) was determined by ABI automated
sequencing.

Screen of a Tribolium (beetle) Genomic Library with Mosquito PCR Product and 900
bp Fragment from the Butterfly Clone. A λgem11 genomic library from Tribolium castaneum
(gift of Rob Denell) was probed with a mixture of the mosquito PCR product (SEQ ID NO:7)
and BstXI/EcoRI fragment of L2. Filters were hybridized at 55° C overnight and washed as
above. Of the 75,000 plaques screened, 14 clones were identified and the SacI fragment of T8
(SEQ ID NO:1), which crosshybridized with the mosquito and butterfly probes, was subcloned
into pBluescript.

PCR on Mouse cDNA Using Degenerate Primers Derived from Regions Conserved in
the Four Insect Homologues. Two degenerate PCR primers (P4REV (SEQ ID NO:16)
GGACXtAATTCT INGANTGYTTYTGGGA; P22 (SEQ ID NO:17) CATACCAOTPAAr,
CXLGTC1GGCCARTGCAT) were designed based on a comparison of ptc amino acid
sequences from fly (Drosophila melanogaster) (SEQ ID NO:6), mosquito (Anopheles gambiae)
(SEQ ID NO:8), butterfly (Precis coenia) (SEQ ID NO:4), and beetle (Tribolium castaneum)
(SEQ ID NO:2). I represents inosine, which can form base pairs with all four nucleotides. P22
was used to reverse transcribe RNA from 12.5 dpc mouse limb bud (gift from David Kingsley)
for 90 min at 37° C. PCR using P4REV (SEQ ID NO:17) and P22 (SEQ ID NO:18) was then
performed on 1 µl of the resultant cDNA under the following conditions:

94° C 4 min.; 72° C add Taq;
[94° C 15 sec.; 50° C 30 sec.; 72° C 90 sec.] 35 times
72° C 10 min.; 4° C hold

PCR products of the expected size were subcloned into the TA vector (Invitrogen)
and sequenced with the Sequenase Version 2.0 DNA Sequencing Kit (U.S.B.).
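
The "low degeneracy" of a primer can be made concrete by counting how many distinct oligonucleotides a degenerate sequence stands for. The sketch below uses the standard IUPAC ambiguity codes and, as an assumption consistent with the note on inosine above, counts I as a single base because it is one nucleotide that pairs with all four. The function name and the example primers are hypothetical and are not the primers of the SEQ ID listings.

```python
# Number of bases represented by each IUPAC nucleotide code.
# Inosine (I) is one physical base, so it does not multiply the pool.
IUPAC_FOLD = {
    "A": 1, "C": 1, "G": 1, "T": 1, "I": 1,
    "R": 2, "Y": 2, "S": 2, "W": 2, "K": 2, "M": 2,
    "B": 3, "D": 3, "H": 3, "V": 3,
    "N": 4,
}


def degeneracy(primer):
    """Return the number of distinct sequences in a degenerate primer pool."""
    total = 1
    for base in primer.upper():
        if base not in IUPAC_FOLD:
            raise ValueError(f"unrecognized base code: {base!r}")
        total *= IUPAC_FOLD[base]
    return total


# Hypothetical primers: N gives a 4-fold position, inosine avoids that cost.
with_n = degeneracy("AARGTNTGY")        # R (2) x N (4) x Y (2)
with_inosine = degeneracy("AARGTITGY")  # R (2) x I (1) x Y (2)
```

Replacing a four-fold position with inosine lowers the pool size fourfold, which is the design motivation for using I in primers drawn from conserved amino acid stretches.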

Using the cloned mouse PCR fragment as a probe, 300,000 plaques of a mouse 8.5 dpc
λgt10 cDNA library (a gift from Brigid Hogan) were screened at 65° C as above and washed
in 2x SSC, 0.1% SDS at room temperature. 7 clones were isolated, and three (M2, M4, and
M8) were subcloned into pBluescript II. 200,000 plaques of this library were rescreened using
first, a 1.1 kb EcoRI fragment from M2 to identify 6 clones (M9-M16) and secondly a mixed
probe containing the most N terminal (XhoI fragment from M2) and most C terminal sequences
(BamHI/BglII fragment from M9) to isolate 5 clones (M17-M21). M9, M10, M14, and
M17-21 were subcloned into the EcoRI site of pBluescript II (Stratagene).

RNA Blots and in situ Hybridizations in Whole and Sectioned Mouse Embryos:

Northerns. A mouse embryonic Northern blot and an adult multiple tissue Northern blot
(obtained from Clontech) were probed with a 900 bp EcoRI fragment from an N terminal coding
region of mouse ptc. Hybridization was performed at 65° C in 5x SSPE, 10x Denhardt's, 100
µg/ml sonicated salmon sperm DNA, and 2% SDS. After several short room temperature
washes in 2x SSC, 0.05% SDS, the blots were washed at high stringency in 0.1X SSC, 0.1%
SDS at 50° C.
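
The difference between the low and high stringency washes above comes largely from salt concentration: less sodium means less shielding of the phosphate backbones, so only well-matched hybrids stay bound. The sketch below estimates sodium content for an SSC dilution. It assumes the conventional composition of 1x SSC (150 mM NaCl, 15 mM trisodium citrate), which is standard laboratory practice rather than something stated in this text, and it ignores the sodium contributed by SDS. The names are hypothetical.

```python
# Conventional composition of 1x SSC (an assumption, not taken from the text).
NACL_MM_1X = 150.0
TRISODIUM_CITRATE_MM_1X = 15.0


def sodium_mm(ssc_fold):
    """Approximate Na+ concentration (mM) of SSC at the given fold strength.

    Each trisodium citrate contributes three sodium ions.
    """
    return ssc_fold * (NACL_MM_1X + 3 * TRISODIUM_CITRATE_MM_1X)


# Wash strengths that appear in the protocols of this section.
washes = {"5x": 5.0, "2x": 2.0, "0.2x": 0.2, "0.1x": 0.1}
sodium_by_wash = {label: sodium_mm(fold) for label, fold in washes.items()}
```

On these assumptions the 0.1X wash carries one twentieth of the sodium of the 2x wash, which together with the higher temperature accounts for its higher stringency.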

In situ hybridization of sections: 7.75, 8.5, 11.5, and 13.5 dpc mouse embryos were
dissected in PBS and frozen in Tissue-Tek medium at -80° C. 12-16 µm frozen sections were
cut, collected onto VectaBond (Vector Laboratories) coated slides, and dried for 30-60 minutes
at room temperature. After a 10 minute fixation in 4% paraformaldehyde in PBS, the slides
were washed 3 times for 3 minutes in PBS, acetylated for 10 minutes in 0.25% acetic anhydride
in triethanolamine, and washed three more times for 5 minutes in PBS. Prehybridization (50%
formamide, 5X SSC, 250 µg/ml yeast tRNA, 500 µg/ml sonicated salmon sperm DNA, and 5x
Denhardt's) was carried out for 6 hours at room temperature in 50% formamide/5x SSC
humidified chambers. The probe, which consisted of 1 kb from the N-terminus of ptc, was
added at a concentration of 200-1000 ng/ml into the same solution used for prehybridization,
and then denatured for five minutes at 80° C. Approximately 75 µl of probe were added to
each slide and covered with Parafilm. The slides were incubated overnight at 65° C in the same
humidified chamber used previously. The following day, the probe was washed successively in
5X SSC (5 minutes, 65° C), 0.2X SSC (1 hour, 65° C), and 0.2X SSC (10 minutes, room
temperature). After five minutes in buffer B1 (0.1 M maleic acid, 0.15 M NaCl, pH 7.5), the
slides were blocked for 1 hour at room temperature in 1% blocking reagent (Boehringer
Mannheim) in buffer B1, and then incubated for 4 hours in buffer B1 containing the DIG-AP
conjugated antibody (Boehringer Mannheim) at a 1:5000 dilution. Excess antibody was
removed during two 15 minute washes in buffer B1, followed by five minutes in buffer B3 (100
mM Tris, 100 mM NaCl, 5 mM MgCl2, pH 9.5). The antibody was detected by adding an alkaline
phosphatase substrate (350 µl 75 mg/ml X-phosphate in DMF, 450 µl 50 mg/ml NBT in 70%
DMF in 100 ml of buffer B3) and allowing the reaction to proceed overnight in the dark. After
a brief rinse in 10 mM Tris, 1 mM EDTA, pH 8.0, the slides were mounted with Aquamount
(Lerner Laboratories).

Drosophila 5'-transcriptional initiation region β-gal constructs. A series of constructs
were designed that link different regions of the ptc promoter from Drosophila to a LacZ
reporter gene in order to study the cis-regulation of the ptc expression pattern. See Fig. 1. A
10.8 kb BamHI/BspMI fragment comprising the 5'-non-coding region of the mRNA at its
3'-terminus was obtained and truncated by restriction enzyme digestion as shown in Fig. 1. These
expression cassettes were introduced into Drosophila lines using a P-element vector (Thummel
et al. (1988) Gene 74:445-456), which were injected into embryos, providing flies which could
be grown to produce embryos. (See Spradling and Rubin (1982) Science 218:341-347 for a
description of the procedure.) The vector used a pUC8 background into which was introduced
the white gene to provide for yellow eyes, portions of the P-element for integration, and the
constructs were inserted upstream from the LacZ gene. The resulting embryos,
larvae, and adults were stained using antibodies to LacZ protein conjugated to HRP and the
samples developed with OPD dye to identify the expression of the LacZ gene. The staining
pattern in embryos is described in Fig. 1, indicating whether there was staining during the early
and late development of the embryo.

Isolation of a Mouse ptc Gene. Homologues of fly ptc (SEQ ID NO:6) were isolated
from three insects: mosquito, butterfly and beetle, using either PCR or low stringency library
screens. PCR primers to six amino acid stretches of ptc of low mutability and degeneracy
were designed. One primer pair, P2 and P4, amplified a homologous fragment of ptc from
mosquito genomic DNA that corresponded to the first hydrophilic loop of the protein. The
PCR product (SEQ ID NO:7) was subcloned and sequenced and when aligned to fly ptc,
showed 67% amino acid identity.
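
Percent identity figures such as the one above are computed from a pairwise alignment. The sketch below shows one common convention, assumed here because the text does not state which was used: identical residues divided by the number of aligned columns in which neither sequence has a gap. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of ungapped alignment columns holding identical residues."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy alignment: five ungapped columns, four of them identical.
identity = percent_identity("ACDE-G", "ACDFHG")
```

Other conventions divide by the full alignment length or by the length of the shorter sequence, so identity values are only comparable when the same denominator is used.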

The cloned mosquito fragment was used to screen a butterfly λgt10 cDNA library. Of
100,000 plaques screened, five overlapping clones were isolated and used to obtain the full
length coding sequence. The butterfly ptc homologue (SEQ ID NO:4) is 1,311 amino acids long
and overall has 50% amino acid identity (72% similarity) to fly ptc. With the exception of a
divergent C-terminus, this homology is evenly spread across the coding sequence. The
mosquito PCR product (SEQ ID NO:7) and a corresponding fragment of butterfly cDNA were
used to screen a beetle λgem11 genomic library. Of the plaques screened, 14 clones were
identified. A fragment of one clone (T8), which hybridized with the original probes, was
subcloned and sequenced. This 3 kb piece contains an 89 amino acid exon (SEQ ID NO:2)
which is 44% and 51% identical to the corresponding regions of fly and butterfly ptc
respectively.

Using an alignment of the four insect homologues in the first hydrophilic loop of the ptc,
two PCR primers were designed to a five and six amino acid stretch which were identical and
of low degeneracy. These primers were used to isolate the mouse homologue using RT-PCR
on embryonic limb bud RNA. An appropriately sized band was amplified and upon cloning and
sequencing, it was found to encode a protein 65% identical to fly ptc. Using the cloned PCR
product and subsequently, fragments of mouse ptc cDNA, a mouse embryonic λ cDNA library
was screened. From about 300,000 plaques, 17 clones were identified and of these, 7 form
overlapping cDNAs that comprise most of the protein-coding sequence (SEQ ID NO:9).

Developmental and Tissue Distribution of Mouse ptc RNA. In both the embryonic and
adult Northern blots, the ptc probe detects a single 8 kb message. Further exposure does not
reveal any additional minor bands. Developmentally, ptc mRNA is present in low levels as early
as 7 dpc and becomes quite abundant by 11 and 15 dpc. While the gene is still present at 17
dpc, the Northern blot indicates a clear decrease in the amount of message at this stage. In the
adult, ptc RNA is present in high amounts in the brain and lung, as well as in moderate amounts
in the kidney and liver. Weak signals are detected in heart, spleen, skeletal muscle, and testes.

In situ Hybridization of Mouse ptc. Northern analysis
indicates that ptc mRNA is present at 7 dpc, while there is no detectable signal in sections from
7.75 dpc embryos. This discrepancy is explained by the low level of transcription. In contrast,
ptc is present at high levels along the neural axis of 8.5 dpc embryos. By 11.5 dpc, ptc can be
detected in the developing lung buds and gut, consistent with its adult Northern profile. In
addition, the gene is present at high levels in the ventricular zone of the central nervous system,
as well as in the zona limitans of the prosencephalon. ptc is also strongly transcribed in the
condensing cartilage of 11.5 and 13.5 dpc limb buds, as well as in the ventral portion of the
somites, a region which is prospective sclerotome and eventually forms bone in the vertebral
column. ptc is present in a wide range of tissues from endodermal, mesodermal and ectodermal
origin supporting its fundamental role in embryonic development.

Isolation of the Human ptc Gene. To isolate human ptc (hptc), 2 x 10^5 plaques from a
human lung cDNA library (HL3022a, Clontech) were screened with a 1 kbp mouse ptc
fragment, M2-2. Filters were hybridized overnight at reduced stringency (60° C in 5X SSC,
10% dextran sulfate, 5X Denhardt's, 0.2 mg/ml sonicated salmon sperm DNA, and 0.5% SDS).
Two positive plaques (H1 and H2) were isolated, the inserts cloned into pBluescript, and upon
sequencing, both contained sequence highly similar to the mouse ptc homolog. To isolate the
5' end, an additional 6 x 10^5 plaques were screened in duplicate with M2-3 EcoRI and M2-3
Xho I (containing 5' untranslated sequence of mouse ptc) probes. Ten plaques were purified
and of these, inserts were subcloned into pBluescript. To obtain the full coding sequence, H2
was fully and H14, H20, and H21 were partially sequenced. The 5.1 kbp of human ptc sequence
(SEQ ID NO:18) contains an open reading frame of 1447 amino acids (SEQ ID NO:19) that
is 96% identical and 98% similar to mouse ptc. The 5' and 3' untranslated sequences of human
ptc (SEQ ID NO:18) are also highly similar to mouse ptc (SEQ ID NO:19) suggesting
conserved regulatory sequence.

Comparison of Mouse, Human, Fly and Butterfly Sequences. The deduced mouse
ptc protein sequence (SEQ ID NO:10) has about 38% identical amino acids to fly ptc over about
1,200 amino acids. This amount of conservation is dispersed through much of the protein






W0 97/45541 PCT/US97/09553 


excepting the C-terminal region. The mouse protein also has a 50 amino acid insert relative to
the fly protein. Based on the sequence conservation of ptc and the functional conservation of
hedgehog between fly and mouse, one concludes that ptc functions similarly in the two
organisms. A comparison of the amino acid sequences of mouse (mptc) (SEQ ID NO:10),
human (hptc) (SEQ ID NO:19), butterfly (bptc) (SEQ ID NO:4) and Drosophila (ptc) (SEQ ID
NO:6) is shown in Table 1.

TABLE 1 

ALIGNMENT OF HUMAN, MOUSE, FLY, AND BUTTERFLY PTC HOMOLOGS 

HPTC HXSAGHXAIPQDR-H^SCCICAMRPAGCCRWWTW 

HPTC MASACNAA GALGRQACCGRRRRTCCPHRA-APDRDrLHRPSyCDA 

n PTC M DRDSLPRVPDTBGD — WDE XLFSDL YI-RTSWVDA 

BPTC KVAPOSSAP5KPRITAAUZ5PCATEA RHSADL YI-RTSWVDA 



* * ** 



Arjtf*QISJ«JKATGRIUU>LWL^^ 

AFALlQISKCKATCJUCAPLWUUU^RIXFKWCYIQJCNCCm^WOLL 
WU>QIDIWKARGSRTAIYUISVFQSHI^T^ 
BPTC AIJa^EI^HIXC^RTSLWIRAWI^ZQLFII^rWDACKVLrVAILVI^ 

** **• * *. .* * ** * . ** * *„»*....* • »**. 



HPTC 
20 MPTC 
PTC 



25 HPTC 
HPTC 
PTC 
BPTC 

30 

HPTC 
MPTC 
PTC 



AKX*BTNVXRLWVBVGGRVSAZX»NYTRQKIGE^ 
AHI*TNV1KLWVBVGCRVSRBI*YTRQKICE£^ 

AQ IflSKV BQLWI QBCCRliK AEIiA YTQKTI OBDB S ATHQLL IQTTHDPKASVLHPQ ^ T T - ft H 

AQIHTRVDQLWATQECORt^AEIJCYTAQAI^EADSSTHQLVIQTAKDPDVSIJ^PCAIXEH 
*. . . . *.-**.. • *# ** ** ** ^ _ . _ . 



. * *** 



II)SAI^RVHVYHYintQWXI^HLCYXSG^ 
I^SAI^RVHVYKYWWOO^EHWYItSOBLIW 
I*roVXATAVKVHLYDTBWUU>MCNM^^ 
BPTC LKVVHAATRVTVHMYD I EVRLKO LCY S PS I POFEG YHH I ES 1 1 DNV I PCA 1 1 TPLDCFWE 
35 • *. * * .. • ... * 



HPTC GAKXQSGTAYLLGKPPLR WTNFDPLEFLEEUt KINYQVDSWBKKLHXAKV 

MPTC OAKLQSOTAYLLGKPPLR WTNFDPLBFLKBUC KIKYQVD5WEEKLNKAEV 

PTC °*QI^^P*SAVVIPGUfQRIXWTTIJ*PASVHQ^ 

40 BPTC QS^-OPDYPIYVPHIJUiXI^JWTHIJ^ KFQFPLSTIEAYMKRAGI 

*- *. . . • ...... 



OHOYMDRPOJIPADK>CPATAPNICKSTKPI4)MALVITOG^ 
CHCYHDRPajnPADPDCPATJtfNITOSTO 
OSGYHBKPCUrPLirPNCPDTAPNXNSTePPW^ 

TSAYMKKPCXDPTDPHCPATAPNKXSGHIPDVAAEt^HGCYGFAAAYMHWPEQLIVOGAT 

^ c ^^^^^^ 

KiPSC K^TCM,V&&HM,Qra^LMT^^ 



HPTC 
MPTC 

45 ptc 

BPTC 
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5 PTC RHRSGHLIUCAQM4SVVQU^ 

BFTC RKS TSAIJlSJ^ALCnVVQIJiGZREMYCYVAO KYTVKQ I OWMQE KAXA VLDXWQRK? AXKV 
•* • *.***.. •** ..**• . * * ...** .*.*•** . * 

BPTC BQSVAQHSTQIC VXXTTTTIUiDIlXSTSDVmiaVASaXl&KLArHC^ 

10 KPTC HQSVAPHSTQX VX^mTTUDDXXJCSrSDVSVIRVASCTIXKIATACLIKLR^ 

PTC EQIXIU^fttJmm>IYV7S5JUUADIIJU^ 

BPTC m-TTSGSVSSAYSFroSTSTU©IIXHCFSBVSI«^ 

*....*.*** **• * . * . * • 

15 HPTC SXSCmVGUUmXVAX^AAOUI^LIGISPKA&TTQVLPPI^ 

KPTC SXSQOAVaUtaVLLVAX^AAGl*^ 

PTC VROC^SVOVA<nnXMCF8TAAGMLSAU£I^ 

BPTC XR3QnGTCXA*r*LIil»3 XTVAA5LG FGaLLG I r 7K A3 5 70 IV? FLAIaSI^v OpKFIX TKTY 

20 

HPTC SKTOOHXRIPFXDRTCKCIJCRTGASV;aT 

KPTC SSTCOHmPPXDRTGBCXJCRTC^VALTSISNVTAPFKAAL 

PTC AB8M IUtXtfnU.IUaCVGPSXIJrSACSTAiMFPAAAPIFVPAI^ 

BPTC VEQACD — VPRZXRTCLVUCKS CLSVLLXSLCNVMAFLAAALLPI P APR VFCLQAA I LLL 

25 

HPTC FHFAXVLLI rPAX LSMDL YRRBDRRLD I FCCFTS PCVSRVI QVB PQAYTDTHDNTRYSP P 

KPTC FWFJUCVIiIFPAII£KDLYRPBDRRIi)IFCCl?SPCVSRVIQ 

PTC SHLAAALLVTPAKI SLDLRRRTAQRAD I FCCCF- PVWKBQPKVAPPVLPLWNKNCR 

30 BPTC PNLGS I LLV7P AM I SLD LRRR&AAP AD LLCCLM- P ESP LPKXRIPER 



HPTC PPysSHSFAHITQXTHQSTVQUtTBYDPHTHVYrTTAEPRSBXSVQPVTVTQOT LSCQSP 

HPTC PPYTSHSFAHXTH XTMQSTVQLRTEYDPHTHVYYTTA2PRS5 X SVQFVTVTQDNLSCQSP 

35 PTC CARHPKSCMHNRVPLPAQNPLLBQPA 

BPTC AKTRJCNDKTHR I D * TTRQPLDP D VS 



HPTC BSTSSTRDIXSQJSOSaUJ<^PPCTKWTLSSFAEim 

40 KPTC MTSSTOIASQFSDSSIflCXJPPCTICWTI^SFAra^ 

PTC 0XPOS6 HSLASP-— SLATPAFQHYTPFLKRSWVKPLTVKOFIAALI 

BPTC INVTJtT — CCL-SV SLTKHAXNQYAPPXHRPAVKVTSKLALIAVIL 



45 HPTC VSLYOTTRVRDCLOLTD IVPRETREYD F I AAQFXY PSFYNKY IVTQXA-D YPM IQHLLYB 

PTC SSWASTRI£Wai>IIDLVFia)SNEHICFU>A^ 

BPTC TSVWCATKVXDGLDLTDIVFEirFDKHBFXtSRQBKYFGFY 

HPTC LHRSPSiman^KI^KNKQI^KKWUnfPRDWW 

50 KPTC LHXBFflinreYVHLIRNEQTJ>QMWIi^ 

PTC YHDSFVRVPHVlJLilWWOLPOFWLLLFSEWLCKLQKI FDIBYRDGRLTKSCKP PNASSDA 

BPTC YHDC^VRXPMXIKHDMGGLTKFWI£I^RDWXU 



55 HPTC VIAXKIXVQTOSRDKPIDISQLTK-QRI'VDAIX3IINPSAPYIYLTAWSKDPVAYAAS^ 

KPTC VIAYKIXVQT08RDKPIDISQLTK^RLVDAIX*IIHPSArYrYXTAWVSKroPVAYAAS^ 

PTC XLATKLXVBTCHVDHPVDKBLVLT'-KRLVHSDa X INQRAF YNYLSAWATHDVFAYGASQG 

BPTC X LAYKUfVBTOHVDNP IDKSL I TAGHRL VDKDG X I HP JCAPYKYLS AWATND ALAYGASQG 

60 
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5 HPTC 
MPTC 
PTC 
BPTC 



10 



15 



HPTC 
MPTC 
PTC 
BPTC 



NIRPHRPSWVHDXADyXPBTRL&IPAABPISYAQFPFTXMLRDTSDPVIAIBKVRTZCS 
NXRPHRPKWVHDKADYMPKTRLRI PAABPIEYAQFPFYLNCLRDTSDFVXAIEKVRVTCN 

KLYPKPRQTFHQPNXY DLKI PKSLPLVYAQMPFYLHCLTDTSQIKTLIOHIRDLSV 

HUCPQPQRWIHSPKDV HLBIKJCSSPLIYTQLPPYLSCLSOTDSIKTLIRSVRDLCL 



MYTSI/tf^STPIWYFFIJVEQYIGLPHWIXLFISVVIJ^ IVKV 
NXTai*a3SYPIIOYPFIJWBQYISLRHWLXI«SV^ 

KWGFct^infPsoiPFiFWKQyHTiJissiJuaLAcnaiJua,^ 

KYIAKGLPNFP BG I PFLFWEQ YL YLRTSLLLALACALG A VF I A VMVLLLN AWAAVLVTLA 



HPTC 
MPTC 
PTC 
20 BPTC 



UUJfTVKXJGHKCLXGXXLSAVPVVILIASTC^ 
VIASLAQXFGAKTIXGXKUAXPAVILXLSVGMMI^FNW^ 

I*ATLVLQIXOVMALI*C VKLS AMPPVLLVLA I G RG VH FTVHLCLG PVTS X GCKRRRAS LAL 



HPTC 
MPTC 

25 ptc 

BPTC 



EHMFAFVLDCAVSTIJjCVLMLACSBFDF IVRY FFAVLAILTILQVTJgGL.VT.T.PVT f ] ' ri 5 yp9 
EHHFAPVXtDOAVSTLLGVLKLACSBFDF I VRY FFAVLAI LTVLGVLKGLVIXPVli*SFFO 
QMSIAPLVHGMLTSGVAVFKI^TSPFBFVIPHPaflXLVVI^ 

KS VLAPWHGALAAALAASHLA . A5E FGFVARLFLRLLLALVTLCL I DGLLFrP IVLS I LO 



HPTC 
30 MPTC 
PTC 
BPTC 



PYPIVSPAKGIJnU*I»TPfiPEPPPSVVRFAMPPCHTHSOSDSSDSEysSQTTVSGLSK-BL 
PCTBVSPANGXJniLPTPSPBPPPSVVllFAVPPCHTHNCSOSSDSKYSSQTTVSOISa-KL 
PKAZLVPLBHPDRISTPSPLPVRSSIOlSOKfiYWQGSRSSRGSC^KSHHHHHKDI^ 
PAASVRFXEHPERLSTPSPKCSPIHPRKSSSSSGCCOKSSRTS-- KSAPRPC APSL 



35 HPTC 
MPTC 
PTC 
BPTC 



40 



HPTC 
MPTC 
PTC 
BPTC 

45 

HPTC 
MPTC 
PTC 
50 BPTC 



RHYKAQQGACGPAHQVIVZATENPVFAHSTVVHPESRHHPPSWPRQQPHU)SOSIJ>POTO 
RQyKAQQGAGCPAHQVXVEATENPVFARSTWHPDSPHQPPLTPRQQPHLOSCSLSPCRQ 
TTITKPQSWKSSNSSIQKPNDWTYQPREQ — RPASYAAPPPAYHKAAAQQHHQHQGPPT 
TTITBBPSSWHSSAHSVQSSMQSIWQPEVWETTTYMOSDSASGRSTPTKSSHGCAITT 



GQQPRRDFPREGLWFPLYRPRRDAFE I STEGUSGPS NRARWG PRGAR5HNPPNFASTAMG 
GQQFRRDFPRBGLRFPP YRPRRDAPE 1 5TECHSCPSKRDR5G PRCARSHNPRNPTSTAMC 

TPPPPFPTA YPPRLQS I WQPEVTVETTHS DS 

TKVTATAMIKVEVVTPSORKSRRSYHYYDRRRDROEDRORDRZRDRORDRDRDRDRDRDR 



SSVPOTCQPITTVTASASVTVAVHPPPVPGPGRNPRCCLCPCY PBTDHGLFKDPHVP 

S6VPSYCQPITTVTASASVTVAVHPP— PGPCRNPRGGPCPOYBSYPETDHGVFBDPHVP 

HT TKVTAT AN I KVELAM P — GPAVRS YNFTS 

DR 



-DRERSRXRORP . DRYRO EPDHPA SPRKNCRDSGHE- 



HPTC 
MPTC 
55 PTC 
BPTC 



FHVRCSRRDSKVEVXELQDVECEERPRGSSSN 
FHVRCKRRDSKVEVXSLQDVECEERPWGSSSN 



SDS5RH 



The identity of other clones recovered from the mouse library is not determined.
These cDNAs cross-hybridize with mouse ptc sequence, while differing as to their restriction
maps. These genes encode a family of proteins related to the patched protein. Alignment of the
human and mouse nucleotide sequences, which includes coding and noncoding sequence, reveals
89% identity.

Radiation hybrid mapping of the human ptc gene. Oligonucleotide primers and
conditions for specifically amplifying a portion of the human ptc gene from genomic DNA by
the polymerase chain reaction were developed. This marker was designated STS SHGC-8725.
It generates an amplification product of 196 bp, which is observed by agarose gel
electrophoresis when human DNA is used as a template, but not when rodent DNA is used.
Samples were scored in duplicate for the presence or absence of the 196 bp product in 83
radiation hybrid DNA samples from the Stanford G3 Radiation Hybrid Panel (purchased from
Research Genetics, Inc.). By comparison of the pattern of G3 panel scores with those for a series
of Genethon meiotic linkage markers, it was determined that the human ptc gene had a two
point lod score of 1,000 with the meiotic marker D9S287, based on no radiation breaks being
observed between the gene and the marker in 83 hybrid cell lines. These results indicate that
the ptc gene lies within 50-100 kb of the marker. Subsequent physical mapping in YAC and
BAC clones confirmed this close linkage estimate. Detailed map information can be obtained
from http://www.shgc.stanford.edu.

Analysis of BCNS mutations. The basal cell nevus syndrome has been mapped to the
same region of chromosome 9q as was found for ptc. An initial screen of EcoRI digested DNA
from probands of 84 BCNS kindreds did not reveal major rearrangements of the ptc gene, and
so screening was performed for more subtle sequence abnormalities. Using vectorette PCR, by
the method according to Riley et al. (1990) N.A.R. 18:2887-2890, on a BAC that contains
genomic DNA for the entire coding region of ptc, the intronic sequence flanking 20 of the 24
exons was determined. Single strand conformational polymorphism (SSCP) analysis of PCR-amplified
DNA from normal individuals, BCNS patients and sporadic basal cell carcinomas (BCC) was
performed for 20 exons of ptc coding sequence. The amplified samples giving abnormal bands
on SSCP were then sequenced.

In blood cell DNA from BCNS individuals, four independent sequence changes were
found; two in exon 15 and two in exon 10. One 49 year old man was found to have a sequence
change in exon 15. His affected sister and daughter have the same alteration, but three
unafflicted relatives do not. His blood cell DNA has an insertion of 9 base pairs at nucleotide
2445 of the coding sequence, resulting in the insertion of three amino acids (PNI) after amino
acid 815. Because the normal sequence preceding the insertion is also PNI, a direct repeat has
been formed.
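
The correspondence between the coding-sequence position and the amino acid number quoted above can be checked with simple arithmetic. The following is a minimal Python sketch, assuming that nucleotide 1 is the first base of the initiation codon and that codons are numbered from 1; the function name `codon_number` is hypothetical and is used for illustration only.

```python
def codon_number(nt_position: int) -> int:
    """Return the 1-based codon that contains a 1-based coding-sequence nucleotide.

    Assumption: position 1 is the first base of the initiation codon.
    """
    if nt_position < 1:
        raise ValueError("nucleotide positions are 1-based")
    # Positions 1-3 fall in codon 1, positions 4-6 in codon 2, and so on.
    return (nt_position + 2) // 3


# Nucleotide 2445 is the last base of codon 815, so an insertion at that
# point places the added residues after amino acid 815.
print(codon_number(2445))
```

Under this numbering, a 9 base pair insertion adds exactly three codons, which agrees with the three added amino acids described above.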

The second case of an exon 15 change is an 18 year old woman who developed jaw
cysts at age 9 and BCCs at age 6. The developmental effects together with the BCCs indicate
that she has BCNS, although none of her relatives are known to have the syndrome. Her blood
cell DNA has a deletion of 11 bp, removing the sequence ATATCCAGCAC at nucleotides 2441
to 2452 of the coding sequence. In addition, nucleotide 2452 is changed from a T to an A. The
deletion results in a frameshift that is predicted to truncate the protein after amino acid 813 with
the addition of 9 amino acids. The predicted mutant protein is truncated after the seventh
transmembrane domain. In Drosophila, a ptc protein that is truncated after the sixth
transmembrane domain is inactive when ectopically expressed, in contrast to the full-length
protein, suggesting that the human protein is inactivated by the exon 15 sequence change. The
patient with this mutation is the first affected family member, since her parents, age 48 and 50,
have neither BCCs nor other signs of the BCNS. DNA from both parents has the normal
nucleotide sequence for exon 15, indicating that the alteration in exon 15 arose in the same
generation as did the BCNS phenotype. Hence her disease is the result of a new mutation. This
sequence change is not detected in 84 control chromosomes.
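
Whether a length change preserves the reading frame depends only on whether it is a multiple of three. The sketch below illustrates this for the two exon 15 changes described above (the 9 bp insertion and the 11 bp deletion); the function name `frame_effect` is hypothetical, and the check deliberately ignores other effects such as altered splicing.

```python
def frame_effect(length_change_bp: int) -> str:
    """Classify an insertion (+) or deletion (-) by its effect on the reading frame."""
    if length_change_bp == 0:
        return "no length change"
    # A change that is a multiple of 3 adds or removes whole codons.
    return "in-frame" if length_change_bp % 3 == 0 else "frameshift"


print(frame_effect(+9))   # the 9 bp insertion adds three whole codons
print(frame_effect(-11))  # the 11 bp deletion shifts the downstream frame
```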

Analysis of sporadic basal cell carcinomas. To determine whether ptc is also
involved in BCCs that are not associated with the BCNS or germline changes, DNA was
examined from 12 sporadic BCCs. Three alterations were found in these tumors. In one tumor,
a C to T transition in exon 3 at nucleotide 523 of the coding sequence changes a highly
conserved leucine to phenylalanine at residue 175 in the first putative extracellular loop domain.
Blood cell DNA from the same individual does not have the alteration, suggesting that it arose
somatically in the tumor. SSCP analysis of exon 3 DNA from 60 individuals who do
not have BCNS found no changes from the normal sequence. Two other sporadic BCCs
have deletions encompassing exon 9 but not extending to exon 8.

The existence of sporadic and hereditary forms of BCCs is reminiscent of the
characteristics of the two forms of retinoblastoma. This parallel, and the frequent deletion in
tumors of the copy of chromosome 9q predicted by linkage to carry the wild-type allele,
demonstrates that the human ptc is a tumor suppressor gene. ptc represses a variety of genes,
including growth factors, during Drosophila development and may have the same effect in
human skin. The often reported large body size of BCNS patients also could be due to reduced
ptc function, perhaps due to loss of control of growth factors. The C to T transition identified
in ptc in the sporadic BCC is also a common genetic change in the p53 gene in BCC and is
consistent with the role of sunlight in causing these tumors. By contrast, the inherited deletion
and insertion mutations identified in BCNS patients, as expected, are not those characteristic
of ultraviolet mutagenesis.
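
The distinction drawn above between a C to T transition and other kinds of change can be made explicit. The following is a minimal Python sketch that classifies a single-base substitution as a transition (purine to purine or pyrimidine to pyrimidine) or a transversion; the function name `substitution_class` is hypothetical. The classification alone does not establish an ultraviolet origin, which also depends on sequence context.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def substitution_class(ref: str, alt: str) -> str:
    """Classify a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(substitution_class("C", "T"))  # the exon 3 change in the sporadic BCC
```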

The identification of the ptc mutations as a cause of BCNS links a large body of
developmental genetic information to this important human disease. In embryos lacking ptc
function, part of each body segment is transformed into an anterior-posterior mirror-image
duplication of another part. The patterning changes in ptc mutants are due in part to
derepression of another segment polarity gene, wingless (wg), a homolog of the vertebrate Wnt
genes that encodes secreted signaling proteins. In normal embryonic development, ptc repression of
wg is relieved by the Hh signaling protein, which emanates from adjacent cells in the posterior
part of each segment. The resulting localized wg expression in each segment primordium
organizes the pattern of bristles on the surface of the animal. The ptc gene represses its own
transcription, while Hh signaling induces ptc transcription.

In flies two other proteins work together with Hh to activate target genes: the ser/thr
kinase fused and the zinc finger protein encoded by cubitus interruptus. Negative regulators
working together with ptc to repress targets are protein kinase A and costal2. Thus, mutations
that inactivate human versions of protein kinase A or costal2, or that cause excessive activity
of human hh, gli, or a fused homolog, may modify the BCNS phenotype and be important in
tumorigenesis.

In accordance with the subject invention, mammalian patched genes, including the
mouse and human genes, are provided, which can serve many purposes. Mutations in the gene
are found in patients with basal cell nevus syndrome, and in sporadic basal cell carcinomas. The
autosomal dominant inheritance of BCNS indicates that patched is a tumor suppressor gene.
The patched protein may be used in a screening for agonists and antagonists, and for assaying
for the transcription of ptc mRNA. The protein or fragments thereof may be used to produce
antibodies specific for the protein or specific epitopes of the protein. In addition, the gene may
be employed for investigating embryonic development, by screening fetal tissue, preparing
transgenic animals to serve as models, and the like.

As described above, patients with basal cell nevus syndrome have a high incidence of
multiple basal cell carcinomas, medulloblastomas, and meningiomas. Because somatic ptc
mutations have been found in sporadic basal cell carcinomas, we have screened for ptc
mutations in several types of sporadic extracutaneous tumors. We found that 2 of 14 sporadic
medulloblastomas bear somatic nonsense mutations in one copy of the gene and also deletion
of the other copy. In addition, we identified missense mutations in ptc in two of seven breast
carcinomas, one of nine meningiomas, and one colon cancer cell line. No ptc gene mutations
were detected in 10 primary colon carcinomas and eighteen bladder carcinomas.

BCNS (OMIM #109400) is a rare autosomal dominant disease with diverse
phenotypic abnormalities, both tumorous (BCCs, medulloblastomas, and meningiomas) and
developmental (misshapen ribs, spina bifida occulta, and skull abnormalities; Gorlin, R.J. (1987)
Medicine 66:98-113). The BCNS gene was mapped to chromosome 9q22.3 by linkage analysis
of BCNS families and by LOH analysis in sporadic BCCs (Gailani, M.R. et al. (1992) Cell
69:111-117). LOH in sporadic medulloblastomas has been reported in the same chromosome
region (Schofield, D. et al. (1995) Am J Pathol 146:472-480). Recently, the human homologue
of the Drosophila patched (PTCH) gene has been mapped to the BCNS region (Hahn, H. et al.
(1996) Cell 85:841-851; Johnson, R.L. et al. (1996) Science 272:1668-1671; Gailani, M.R. et
al. (1996) Nat Genet 14:78-81; Xie, J. et al. (1997) Genes Chromosomes Cancer 18:305-309),
and mutations in this gene have been found in the blood DNA of BCNS patients and in the DNA
of sporadic BCCs (Hahn, H. et al., supra; Johnson, R.L. et al., supra; Gailani, M.R. et al.,
supra; and Chidambaram, A. et al. (1996) Cancer Res 56:4599-4601). ptc appears to function
as a tumor suppressor gene; inactivation abrogates its normal inhibition of the hedgehog
signaling pathway. Because of the wide variety of tumors in patients with the BCNS and wide
tissue distribution of ptc gene expression, we have begun screening for ptc gene mutations in
several types of human cancers, especially those present in increased numbers in BCNS patients
(medulloblastomas), those in tissues derived embryologically from epidermis (breast carcinomas),
and those with chromosome 9q LOH (bladder carcinomas; see Cairns, P. et al. (1993) Cancer
Res 53:1230-1232; and Sidransky, D. et al. (1997) NEJM 326:737-740).
Materials and methods 

Clinical Materials. Diagnoses of all tumors were confirmed histologically. Cell lines
were obtained from the American Type Culture Collection. DNA was extracted from tumors or
matched normal tissue (peripheral blood leukocytes or skin) as described (Cogen, P.H. et al.
(1990) Genomics 8:279-285; and Sambrook, J. et al. Molecular Cloning: A Laboratory
Manual, Ed. 2, Vol. 2, pp. 9.17-9.19, Cold Spring Harbor, NY (1989)).

PCR and Heteroduplex Analysis. PCR amplification and heteroduplex/SSCP analysis
were performed as described (Johnson, R.L. et al., supra; Spritz, R.A. et al. (1992) Am J Hum
Genet 51:1058-1065). Primers used and intron/exon boundary sequences of the ptc gene were
derived as reported previously (Johnson, R.L. et al., supra) and are shown in Table 1. Primers
for exons 1 and 2 were from Hahn et al. (supra).

Sequence Analysis. Exon segments exhibiting bands were reamplified and were
sequenced directly using the Sequenase sequencing kit according to the protocol recommended
by the manufacturer (United States Biochemical Corp.). A second sequencing was performed
using independently amplified PCR products to confirm the sequence change. The amplified
PCR products from each tumor were also cloned into the plasmid vector pCR 2.1 (Invitrogen),
followed by sequence analysis of at least four independent clones. The sequence alteration was
confirmed from at least two independent clones. Simplified amplification of specific allele
analysis was performed according to Lei and Hall (Lei, X. and Hall, B.G. (1994) Biotechniques
16:44-45).

Allelic Loss Analysis. Microsatellites used for allelic loss analysis were D9S109,
D9S119, D9S127, D9S196, and D9S287, described in the CHLC human screening set (Research
Genetics). A part of the ptc intron 1 sequence was tested for polymorphism in a control
population and found to be polymorphic in 80% of the samples tested. This microsatellite was
used for analysis of ptc gene allelic loss in bladder carcinomas. The primer sequences are as
follows: forward primer, 5'-CTGAGCAGATTTCCCAGGTC-3'; and reverse primer, 5'-
CCTCAGACAGACCTTTCCTC-3'. The PCR cycling for this newly isolated marker was 4
min at 95°C followed by [...]. PCR
products were separated on 6% polyacrylamide gels and exposed to film.
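
Basic properties of the two intron 1 microsatellite primers given above can be estimated directly from their sequences. The following is a minimal Python sketch that computes length, GC content, and a rough melting temperature using the Wallace rule (2 degrees per A or T, 4 degrees per G or C). The Wallace rule is an approximation chosen here for illustration; it is not stated in the specification, and the function names `gc_percent` and `wallace_tm` are hypothetical.

```python
FORWARD = "CTGAGCAGATTTCCCAGGTC"  # forward primer as given in the text
REVERSE = "CCTCAGACAGACCTTTCCTC"  # reverse primer as given in the text


def gc_percent(seq: str) -> float:
    """Return the percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature in degrees C by the Wallace rule (an approximation)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


for name, primer in (("forward", FORWARD), ("reverse", REVERSE)):
    print(name, len(primer), gc_percent(primer), wallace_tm(primer))
```

Both primers are 20 nucleotides long with the same GC content, so this rough estimate gives them the same melting temperature.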
Results and Discussion 

Intronic boundaries were determined for 22 exons of ptc by sequencing vectorette
PCR products derived from BAC 192J22 (Johnson, R.L. et al., supra; Table 1). Our findings are in
agreement with those of Hahn et al. (supra), except that we find exon 12 is composed of 2
separate exons of 126 and 119 nucleotides. This indicates that ptc is composed of 23 coding
exons instead of 22. In addition, we find that exons 3, 4, 10, 11, 17, 21, and 23 differ slightly
in size from those reported previously (Hahn et al., supra). Of 63 tumors studied, 14 were sporadic
medulloblastomas, and 9 were sporadic meningiomas. These 23 tumors were examined for
allelic deletions by genotyping of tumor and blood DNA with microsatellite markers that flank
the ptc gene: D9S119, D9S196, D9S287, D9S127, and D9S109. Four of 14 medulloblastomas
had LOH. Two of the medulloblastomas, both of which had LOH, had mutations (med34 and
med36; see Cogen, P.H. et al., supra), which are predicted to result in truncated proteins (Table
2). DNA samples from the blood of these patients lack these mutations, indicating that they
both are somatic mutations. med34 also has allelic loss on 17p (Cogen, P.H. et al., supra). We
were unable to detect ptc gene mutations by heteroduplex analysis in the other two
medulloblastomas bearing LOH on 9q. The pathological features of these two tumors differed
in that med34 belongs to the desmoplastic subtype, whereas med36 is of the classic type,
indicating that ptc mutations in medulloblastomas are not restricted to a specific subtype.
TABLE 1 Primers and boundary sequences of PTCH

One report (Schofield, D. et al., supra) has shown that five medulloblastomas (two
BCNS-associated cases and three sporadic cases) bearing LOH on chromosome 9q22.3-q31 are
all of the desmoplastic subtype, suggesting LOH on 9q22.3 is histological subtype specific. We
feel that the conclusion derived from only five positive tumors is not a strong one because we
and others (Raffel, C. et al. (1997) Cancer Res 57:842-845) have found nondesmoplastic
subtypes of medulloblastomas bearing LOH on chromosome 9q22.3. Independently, another
group has reported their finding of ptc mutations in sporadic medulloblastomas (Raffel, C. et
al., supra).

A change of T to C at nucleotide 2990 (in exon 18) was identified in DNA from one
of nine sporadic meningiomas, causing a predicted change of codon 997 from Ile to Thr (Table
2). The meningioma bearing this mutation also has allelic loss on 9q22.3. Blood cell DNA is
heterozygous for this mutation, but DNA from the tumor contains only the mutant sequence.
Of 100 normal chromosomes examined, none has this sequence change, suggesting that this
mutation is not likely a common polymorphism. This patient is 84 years old and has had no
phenotypic abnormalities suggestive of the BCNS, suggesting that this sequence alteration may
not have caused complete inactivation of the ptc gene. None of the other eight meningiomas
had detectable LOH at chromosome 9q.

TABLE 2 PATCHED gene alterations

Tumor   Pathology                       Nucleotide  Codon      Exon   Consequence   LOH  Mutation type
Med34   Medulloblastoma (desmoplastic)  TC1869A     623        [...]  Frameshift    Yes  Somatic
Med36   Medulloblastoma (classic)       G2503T      835        15     Glu to STOP   Yes  Somatic
Men1    Meningioma                      T2990C      997        18     Ile to Thr    Yes  Germline
Br349   Breast carcinoma                [...]       955        17     Tyr to His    Yes  Somatic
Br321   Breast carcinoma                A2975G      995        18     Glu to Gly    No   Somatic
Co320   Colon cancer cell line          A2000C      667        14     Glu to Ala    No   Unknown
Co8-1   Colon carcinoma                 T to C      Intron 10         Polymorphism  No   Germline
Co15-1  Colon carcinoma                 T to C      Intron 10         Polymorphism  No   Germline

We also examined a variety of other tumors, including seven breast carcinomas, 11 colon
cancers (10 primary tumors and 1 cell line), 18 bladder tumors (14 primary tumors and 4 cell
lines), and 2 ovarian cancer cell lines. These
tumors are not known to occur in higher than expected frequency in BCNS patients. We
identified sequence abnormalities in two breast carcinomas and in the one colon cancer cell line
(Table 2). The mutation found in breast carcinoma Br349 is not present in the patient's normal
skin DNA, indicating that the sequence change is a somatic mutation. Direct sequencing of the
PCR product indicated that only the mutant allele is present in the tumor. This mutation
changes codon 955 from Tyr to His, and this Tyr is conserved in human, murine, chicken, and
fly ptc homologues (Goodrich, L.V. et al. (1996) Genes Dev 10:301-312). The mutation in
breast carcinoma Br321 is predicted to change codon 995 from Glu to Gly, and the tumor with
this mutation retains the wild-type allele. We have sequenced exon 18 in DNA from the blood
of 50 normal persons and found no changes from the published sequence, suggesting that the
sequence change found in Br321 is not a common polymorphism. Furthermore, examination
of the DNA from the cultured skin fibroblasts of the patient did not reveal the same mutation,
indicating that this is a somatic mutation.

Because DNA is not available from normal cells of the patient from which colon cell
line 320 was established, we used simplified amplification of specific allele analysis (Lei, X. and
Hall, B.G., supra) to examine 50 normal blood DNA samples for the presence of the sequence
alteration and found none but the DNA from this cell line to have the mutant allele, suggesting
that this mutation also is unlikely to be a common sequence polymorphism. For bladder
carcinomas, a newly isolated microsatellite that was derived from intron 1 of the ptc gene was
used to examine LOH in the tumor. Three primary bladder carcinomas showed LOH at this
intragenic locus. With no ptc mutations detected in these tumors, we suspect that the LOH in
these three bladder carcinomas may reflect the high incidence of whole chromosome 9 loss in
bladder cancers (Sidransky, D. et al., supra). A similar observation has been reported
previously (Simoneau, A.R. et al. (1996) Cancer Res 56:5039-5043).

We also detected a sequence change in intron 10 in two colon carcinomas, 15-1 and
8-1, which was reported previously as a splicing mutation (Unden, A.B. et al. (1996)
Cancer Res 56:4562-4565). Because we found the same sequence change in about 20% of
normal control samples, we suggest that this more likely is a nonpathogenic polymorphism. The
ptc protein is predicted to contain 12 transmembrane domains, two large extracellular loops, and
one intracellular loop (Goodrich, L.V. et al., supra). Of the six mutations we identified, four
are missense mutations. Three mutations lead to amino acid substitutions in the second
extracellular loop, and one mutation results in an amino acid change in the intracellular domain.

Our data indicate that somatic inactivation of the ptc gene does occur in some
sporadic medulloblastomas. In addition, because missense mutations of the ptc gene were
detected in breast carcinomas, we suspect that defects of the ptc function also may be involved
in some breast carcinomas, although biochemical evidence is necessary to show how these
missense mutations might impair ptc function. Of 11 colon cancers and 18 bladder carcinomas
examined, we found only one mutation in 1 colon cell line, suggesting that ptc gene mutations
are relatively uncommon in colon and bladder cancers, although the incidence of chromosome 9
loss in bladder cancers is high (Cairns, P. et al., supra).

Published reports of SSCP analysis of tumor DNA identified mutations in the ptc gene
in only 30% of sporadic BCCs, although chromosome 9q22.3 LOH was reported in more than
50% of these tumors (Gailani, M.R. et al., supra). It has been reported that heteroduplex/SSCP
analysis of gene mutations is more sensitive than SSCP analysis (Spritz, R.A. et al., supra). In
our studies, we were able to identify a point mutation in the 310-bp PCR product from exon 15
using heteroduplex analysis, whereas SSCP analysis failed to reveal this sequence change (Table
2). Therefore, we suspect that there may be more mutations in BCCs than we have found thus
far. Analysis of the ptc gene in BCNS patients and in sporadic BCCs has identified mutations
scattered widely across the gene, and the majority of mutations were predicted to result in
truncated proteins (Hahn, H. et al., supra; Johnson, R.L. et al., supra; Gailani, M.R. et al.,
supra; Chidambaram, A. et al., supra; Unden, A.B. et al., supra; Wicking, C. et al. (1997) Am
J Hum Genet 60:21-26). In our screening, we found two breast carcinomas bearing missense
mutations of the ptc gene. In one of these two tumors, Br349, direct sequencing indicated a
deletion of the other copy of the ptc gene. Any comparison of mutations in skin cancers versus
extracutaneous tumors must consider the wholly different causes of these mutations; UV light
is unique to the skin.

All publications and patent applications cited in this specification are herein
incorporated by reference as if each individual publication or patent application were
specifically and individually indicated to be incorporated by reference.

Although the foregoing invention has been described in some detail by way of
illustration and example for purposes of clarity of understanding, it will be readily apparent to
those of ordinary skill in the art in light of the teachings of this invention that certain changes
and modifications may be made thereto without departing from the spirit or scope of the
appended claims.
SEQUENCE LISTING

(1) GENERAL INFORMATION:

(i) APPLICANT: SCOTT, MATTHEW P.
GOODRICH, LISA V.
JOHNSON, RONALD L.

(ii) TITLE OF INVENTION: Patched Genes and Their Uses

(iii) NUMBER OF SEQUENCES: 19

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Foley, Hoag & Eliot LLP
(B) STREET: One Post Office Square
(C) CITY: Boston
(D) STATE: MA
(E) COUNTRY: US
(F) ZIP: 02109

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Vincent, Matthew P.
(B) REGISTRATION NUMBER: 36,709
(C) REFERENCE/DOCKET NUMBER: SOV003.26

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: 617-832-1000
(B) TELEFAX: 617-832-7000

(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 736 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
AACKHCNNTO NATGOCACCC CCNCCCAACC TTTNNNCCNN NTAANCAAAA NNCCCCNTTT 60 

Hamccc? imto Mc ^ 
^ramAcc ecccccaccc WMxzccm mrncc^ccc ceAAanACA actcc^cc ieo 
AAAATTNANA NAATTGGTCC TAACCTAACC NATNGTTGTT ACGGTTTCCC CCCCCAAATA 240

CATGCACTGG CCCGAACACT TGATCGTTGC CGTTCCAATA AGAATAAATC TGGTCATATT 300

AAACAAGCCN AAAGCTTTAC AAACTGTTGT ACAATTAATG GGCGAACACG AACTGTTCGA 360

ATTCTGGTCT GGACATTACA AAGTGCACCA CATCGGATGG AACCAGGAGA AGGCCACAAC 420

CGTACTGAAC GCCTGGCAGA AGAAGTTCGC ACAGGTTGGT GGTTGGCGCA AGGAGTAGAG 480

TGAATGGTGG TAATTTTTGG TTGTTCCAGG AGGTGGATCG TCTGACGAAG AGCAAGAAGT 540

CGTCGAATTA CATCTTCGTG ACGTTCTCCA CCGCCAATTT GAACAAGATG TTGAAGGAGG 600

<8^CGTCGAANAC GGACGTGGTG AAGCTGGGGG TGGTGCTGGG GGTGGCGGCG GTGTACGGGT 660 

jSjGCTGGCCCA GTCGGGGCTG GCTGCCTTGG GAGTGCTGGT CTTNGCGNGC TNCNATTCGC 720 
C>£CTATAGTNA GNCGTA 

w 

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 107 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Xaa Pro Pro Pro Asn Tyr Asn Ser Xaa Pro Lys Xaa Xaa Xaa Leu Val
1               5                   10                  15

Leu Thr Pro Xaa Val Val Thr Val Ser Pro Pro Lys Tyr Met His Trp
            20                  25                  30

Pro Glu His Leu Ile Val Ala Val Pro Ile Arg Ile Asn Leu Val Ile
        35                  40                  45

Leu Asn Lys Pro Lys Ala Leu Gln Thr Val Val Gln Leu Met Gly Glu
    50                  55                  60

His Glu Leu Phe Glu Phe Trp Ser Gly His Tyr Lys Val His His Ile
65                  70                  75                  80

Gly Trp Asn Gln Glu Lys Ala Thr Thr Val Leu Asn Ala Trp Gln Lys
                85                  90                  95

Lys Phe Ala Gln Val Gly Gly Trp Arg Lys Glu
            100                 105
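
The relationship between the nucleotide listing of SEQ ID NO:1 and the amino acid listing of SEQ ID NO:2 can be spot-checked by translating a short stretch with the standard genetic code. The following is a minimal Python sketch; the 48-nucleotide fragment is copied from the SEQ ID NO:1 line that ends at position 300, and the reading frame used is an assumption chosen so that the output can be compared with residues 33 to 48 of SEQ ID NO:2. The names `CODON_TABLE` and `translate` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate DNA in frame 1 to one-letter amino acids; unknown codons become X."""
    dna = dna.upper().replace(" ", "")
    usable = len(dna) - len(dna) % 3  # ignore an incomplete trailing codon
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# Fragment of SEQ ID NO:1 (assumed frame), taken from the listing above.
fragment = "CCCGAACACT TGATCGTTGC CGTTCCAATA AGAATAAATC TGGTCATA"
print(translate(fragment))
```

In one-letter code the result reads PEHLIVAVPIRINLVI, that is, Pro Glu His Leu Ile Val Ala Val Pro Ile Arg Ile Asn Leu Val Ile. Codons containing the ambiguity symbol N are reported as X, which parallels the Xaa entries in the protein listing.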

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5187 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GGGTCTGTCA CCCGGAGCCG GAGTCCCCGG CGGCCAGCAG CGTCCTCGCG AGCCGAGCGC 60 

CCAGGCGCGC CCGGAGCCCG CGGCGGCGGC GGCAACATGG CCTCGGCTGG TAACGCCGCC 120 

GGGGCCCTGG GCAGGCAGGC CGGCGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180 

GCCGCGCCGG ACCGGGACTA TCTGCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240 

GAGCAGATTT CCAAGGGGAA GGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300 

TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGCGG CAAGTTTTTG 360

GTTGTGGGTC TCCTCATATT TGGGGCCTTC GCTGTGGGAT TAAAGGCAGC TAATCTCGAG 420

ACCAACGTGG AGGAGCTGTG GGTGGAAGTT GGTGGACGAG TGAGTCGAGA ATTAAATTAT 480

ACCCGTCAGA AGATAGGAGA AGAGGCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540

AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600 

CTCCAGGCCA GTCGTGTGCA CGTCTACATG TATAACAGGC AATGGAAGTT GGAACATTTG 660 

TGCTACAAAT CAGGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720 

CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780 

TCCGGGACAG CATACCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840

GAATTCCTAG AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGGA GGAAATGCTG 900

AATAAAGCCG AAGTTGGCCA TGGGTACATG GACCGGCCTT GCCTCAACCC AGCCGACCCA 960

GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTGTT 1020

TTGAATGGTG GATGTCAAGG TTTATCCAGG AAGTATATGC ATTGGCAGGA GGAGTTGATT 1080

GTGGGTGGTA CCGTCAAGAA TGCCACTGGA AAACTTGTCA GCGCTCACGC CCTGCAAACC 1140 

ATCTTCCAGT TAATGACTCC CAAGCAAATG TATGAACACT TCAGGGGCTA CGACTATGTC 1200 

TCTCACATCA ACTGGAATGA AGACAGGGCA GCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260

TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320 

ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380 

GCCAGCGGCT ACCTACTGAT GCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440 

TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 1500 

GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560 

TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620 

AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680 
CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740
GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800
TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860
CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920
ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980
CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040
CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100
TCTGAGATCT CTGTACAGCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160
GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220
CTCGAGCCCC CCTGCACCAA GTGGACACTC TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 2280
TTCCTCCTGA AACCCAAAGC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340
GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACCGA CATTGTTCCC 2400
CGGGAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460
ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520
CATAAGAGTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 2580
ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640
TGGGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700
GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCGACAT TAGTCAGTTG 2760
ACTAAACAGC GTCTGGTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820
CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880
CCTCACCGGC CGGAGTGGGT CCATGACAAA GCCGACTACA TGCCAGAGAC CAGGCTGAGA 2940
ATCCCAGCAG CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGGCCTACGA 3000
GACACCTCAG ACTTTGTGGA AGCCATAGAA AAAGTGAGAG TCATCTGTAA CAACTATACG 3060
AGCCTGGGAC TGTCCAGCTA CCCCAATGGC TACCCCTTCC TGTTCTGGGA GCAATACATC 3120
AGCCTGCGCC ACTGGCTGCT GCTATCCATC AGCGTGGTGC TGGCCTGCAC GTTTCTAGTG 3180
TGCGCAGTCT TCCTCCTGAA CCCCTGGACG GCCGGGATCA TTGTCATGGT CCTGGCTCTG 3240
ATGACCGTTG AGCTCTTTGG CATGATGGGC CTCATTGGGA TCAAGCTGAG TGCTGTGCCT 3300
GTGGTCATCC TGATTGCATC TGTTGGCATC GGAGTGGAGT TCACCGTCCA CGTGGCTTTG 3360
GCCTTTCTCA CAGCCATTGG GGACAAGAAC CACAGGGCTA TGCTCGCTCT GGAACACATG 3420
TTTGCTCCCG TTCTGGACGG TGCTGTGTCC ACTCTGCTGG GTGTACTGAT GCTTGCAGGG 3480
TCCGAATTTG ATTTCATTGT CAGATACTTC TTTGCCGTCC TGGCCATTCT CACCGTCTTG 3540
GGGGTTCTCA ATGGACTGGT TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTGTCCT 3600
GAGGTGTCTC CAGCCAATGG CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660

AGTGTCGTCC GGTTTGCCGT GCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720

TCGGAGTACA GCTCTCAGAC CACGGTGTCT GGCATCAGTG AGGAGCTCAG GCAATACGAA 3780

GCACAGCAGG GTGCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840

GTCTTTGCCC GGTCCACTGT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900

CGGCAACAGC CCCACCTGGA CTCTGGCTCC TTGTCCCCTG GACGGCAAGG CCAGCAGCCT 3960

CGAAGGGATC CCCCTAGAGA AGGCTTGCGG CCACCCCCCT ACAGACCGCG CAGAGACGCT 4020

TTTGAAATTT CTACTGAAGG GCATTCTGGC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080

GGGGCCCGTT CTCACAACCC TCGGAACCCA ACGTCCACCG CCATGGGCAG CTCTGTGCCC 4140

AGCTACTGCC AGCCCATCAC CACTGTGACG GCTTCTGCTT CGGTGACTGT TGCTGTGCAT 4200

CCCCCGCCTG GACCTGGGCG CAACCCCCGA GGGGGGCCCT GTCCAGGCTA TGAGAGCTAC 4260

CCTGAGACTG ATCACGGGGT ATTTGAGGAT CCTCATGTGC CTTTTCATGT CAGGTGTGAG 4320

AGGAGGGACT CAAAGGTGGA GGTCATAGAG CTACAGGACG TGGAATGTGA GGAGAGGCCG 4380

TGGGGGAGCA GCTCCAACTG AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAAGATTGGA 4440

AAGCCCCGCC CCCACCTCTT TCCAGAACTG CTTGAAGAGA ACTGCTTGGA ATTATGGGAA 4500

GGCAGTTCAT TGTTACTGTA ACTGATTGTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560

AARAGGTGTA CACATGTAAT ATACATGGAA ATGCTGTACA GTCTATTTCC TGGGGCCTCT 4620

CCACTCCTGC CCCAGAGTGG GGAGACCACA GGGGCCCTTT CCCCTGTGTA CATTGGTCTC 4680

TGTGCCACAA CCAAGCTTAA CTTAGTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740

CTTAAATATT GTATAATTTA CTTGTATAAT TCTATGCAAA TATTGCTTAT GTAATAGGAT 4800

TATTTGTAAA GGTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860

ATGAATTGTT ACTGTTAACT TTTGAACACG CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920

ATGAAGAAAA CAGGTTAATC CCAGTGGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980

GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040

TGCTGTTGTT GTTCATTTTG GTGTTTTTGG TTGCTTTGTA TGATCTTAGC TCTGGCCTAG 5100

GTGGGCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATGCT GGTGGAAAGG TGACCCCAAT 5160

CATCTGTCCT ATTCTCTGGG ACTATTC 5187

(2) INFORMATION FOR SEQ ID NO: 4: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 1311 amino acids

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 
(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4: 

Met Val Ala Pro Asp Ser Glu Ala Pro Ser Asn Pro Arg Ile Thr Ala
1 5 10 15

Ala His Glu Ser Pro Cys Ala Thr Glu Ala Arg His Ser Ala Asp Leu
20 25 30

Tyr Ile Arg Thr Ser Trp Val Asp Ala Ala Leu Ala Leu Ser Glu Leu
35 40 45

Glu Lys Gly Asn Ile Glu Gly Gly Arg Thr Ser Leu Trp Ile Arg Ala
50 55 60

Trp Leu Gln Glu Gln Leu Phe Ile Leu Gly Cys Phe Leu Gln Gly Asp
65 70 75 80

Ala Gly Lys Val Leu Phe Val Ala Ile Leu Val Leu Ser Thr Phe Cys
85 90 95

Val Gly Leu Lys Ser Ala Gln Ile His Thr Arg Val Asp Gln Leu Trp
100 105 110

Val Gln Glu Gly Gly Arg Leu Glu Ala Glu Leu Lys Tyr Thr Ala Gln
115 120 125

Ala Leu Gly Glu Ala Asp Ser Ser Thr His Gln Leu Val Ile Gln Thr
130 135 140

Ala Lys Asp Pro Asp Val Ser Leu Leu His Pro Gly Ala Leu Leu Glu
145 150 155 160

His Leu Lys Val Val His Ala Ala Thr Arg Val Thr Val His Met Tyr
165 170 175

Asp Ile Glu Trp Arg Leu Lys Asp Leu Cys Tyr Ser Pro Ser Ile Pro
180 185 190

Asp Phe Glu Gly Tyr His His Ile Glu Ser Ile Ile Asp Asn Val Ile
195 200 205

Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ser Lys
210 215 220

Leu Leu Gly Pro Asp Tyr Pro Ile Tyr Val Pro His Leu Lys His Lys
225 230 235 240

Leu Gln Trp Thr His Leu Asn Pro Leu Glu Val Val Glu Glu Val Lys
245 250 255

Lys Leu Lys Phe Gln Phe Pro Leu Ser Thr Ile Glu Ala Tyr Met Lys
260 265 270

Arg Ala Gly Ile Thr Ser Ala Tyr Met Lys Lys Pro Cys Leu Asp Pro
275 280 285

Thr Asp Pro His Cys Pro Ala Thr Ala Pro Asn Lys Lys Ser Gly His
290 295 300
Ile Pro Asp Val Ala Ala Glu Leu Ser His Gly Cys Tyr Gly Phe Ala
305 310 315 320

Ala Ala Tyr Met His Trp Pro Glu Gln Leu Ile Val Gly Gly Ala Thr
325 330 335

Arg Asn Ser Thr Ser Ala Leu Arg Lys Ala Arg Xaa Leu Gln Thr Val
340 345 350

Val Gln Leu Met Gly Glu Arg Glu Met Tyr Glu Tyr Trp Ala Asp His
355 360 365

Tyr Lys Val His Gln Ile Gly Trp Asn Gln Glu Lys Ala Ala Ala Val
370 375 380



Arg juys me Ala Ala Glu Val Arg ,Lys lie Thr 
385 390 395 400 

Thr Ser Gly Ser Val Ser Ser Ala Tyr Ser Phe Tyr Pro Phe Ser Thr 
405 410 415 

Ser Thr Leu Asn Asp Ile Leu Gly Lys Phe Ser Glu Val Ser Leu Lys
420 425 430

Asn Ile Ile Leu Gly Tyr Met Phe Met Leu Ile Tyr Val Ala Val Thr
435 440 445

Leu Ile Gln Trp Arg Asp Pro Ile Arg Ser Gln Ala Gly Val Gly Ile
450 455 460

Ala Gly Val Leu Leu Leu Ser Ile Thr Val Ala Ala Gly Leu Gly Phe
465 470 475 480

Cys Ala Leu Leu Gly Ile Pro Phe Asn Ala Ser Ser Thr Gln Ile Val
485 490 495

Pro Phe Leu Ala Leu Gly Leu Gly Val Gln Asp Met Phe Leu Leu Thr
500 505 510

His Thr Tyr Val Glu Gln Ala Gly Asp Val Pro Arg Glu Glu Arg Thr
515 520 525

Gly Leu Val Leu Lys Lys Ser Gly Leu Ser Val Leu Leu Ala Ser Leu
530 535 540

Cys Asn Val Met Ala Phe Leu Ala Ala Ala Leu Leu Pro Ile Pro Ala
545 550 555 560

Phe Arg Val Phe Cys Leu Gln Ala Ala Ile Leu Leu Leu Phe Asn Leu
565 570 575

Gly Ser Ile Leu Leu Val Phe Pro Ala Met Ile Ser Leu Asp Leu Arg
580 585 590

Arg Arg Ser Ala Ala Arg Ala Asp Leu Leu Cys Cys Leu Met Pro Glu 
595 600 605 

Ser Pro Leu Pro Lys Lys Lys Ile Pro Glu Arg Ala Lys Thr Arg Lys
610 615 620

Asn Asp Lys Thr His Arg Ile Asp Thr Thr Arg Gln Pro Leu Asp Pro
625 630 635 640

Asp Val Ser Glu Asn Val Thr Lys Thr Cys Cys Leu Ser Val Ser Leu
645 650 655

Thr Lys Trp Ala Lys Asn Gln Tyr Ala Pro Phe Ile Met Arg Pro Ala
660 665 670

Val Lys Val Thr Ser Met Leu Ala Leu Ile Ala Val Ile Leu Thr Ser
675 680 685

Val Trp Gly Ala Thr Lys Val Lys Asp Gly Leu Asp Leu Thr Asp Ile
690 695 700

Val Pro Glu Asn Thr Asp Glu His Glu Phe Leu Ser Arg Gln Glu Lys
705 710 715 720

Tyr Phe Gly Phe Tyr Asn Met Tyr Ala Val Thr Gln Gly Asn Phe Glu
725 730 735

Tyr Pro Thr Asn Gln Lys Leu Leu Tyr Glu Tyr His Asp Gln Phe Val
740 745 750

Arg Ile Pro Asn Ile Ile Lys Asn Asp Asn Gly Gly Leu Thr Lys Phe
755 760 765

Trp Leu Ser Leu Phe Arg Asp Trp Leu Leu Asp Leu Gln Val Ala Phe
770 775 780

Asp Lys Glu Val Ala Ser Gly Cys Ile Thr Gln Glu Tyr Trp Cys Lys
785 790 795 800

Asn Ala Ser Asp Glu Gly Ile Leu Ala Tyr Lys Leu Met Val Gln Thr
805 810 815

Gly His Val Asp Asn Pro Ile Asp Lys Ser Leu Ile Thr Ala Gly His
820 825 830

Arg Leu Val Asp Lys Asp Gly Ile Ile Asn Pro Lys Ala Phe Tyr Asn
835 840 845

Tyr Leu Ser Ala Trp Ala Thr Asn Asp Ala Leu Ala Tyr Gly Ala Ser
850 855 860

Gln Gly Asn Leu Lys Pro Gln Pro Gln Arg Trp Ile His Ser Pro Glu
865 870 875 880

Asp Val His Leu Glu Ile Lys Lys Ser Ser Pro Leu Ile Tyr Thr Gln
885 890 895

Leu Pro Phe Tyr Leu Ser Gly Leu Ser Asp Thr Xaa Ser Ile Lys Thr
900 905 910

Leu Ile Arg Ser Val Arg Asp Leu Cys Leu Lys Tyr Glu Ala Lys Gly
915 920 925

Leu Pro Asn Phe Pro Ser Gly Ile Pro Phe Leu Phe Trp Glu Gln Tyr
930 935 940

Leu Tyr Leu Arg Thr Ser Leu Leu Leu Ala Leu Ala Cys Ala Leu Ala
945 950 955 960

Ala Val Phe Ile Ala Val Met Val Leu Leu Leu Asn Ala Trp Ala Ala
965 970 975
Val Leu Val Thr Leu Ala Leu Ala Thr Leu Val Leu Gln Leu Leu Gly
980 985 990

Val Met Ala Leu Leu Gly Val Lys Leu Ser Ala Met Pro Ala Val Leu
995 1000 1005

Leu Val Leu Ala Ile Gly Arg Gly Val His Phe Thr Val His Leu Cys
1010 1015 1020

Leu Gly Phe Val Thr Ser Ile Gly Cys Lys Arg Arg Arg Ala Ser Leu
1025 1030 1035 1040

Ala Leu Glu Ser Val Leu Ala Pro Val Val His Gly Ala Leu Ala Ala
1045 1050 1055

Ala Leu Ala a l » c« - ^ , 

""" 106o"" * WW ^ U AX * SSr GIu Cys Gl y Phe Val Ala 

1065 1070 

Arg Leu Phe Leu Arg Leu Leu Leu Asp Ile Val Phe Leu Gly Leu Ile
1075 1080 1085

Asp Gly Leu Leu Phe Phe Pro Ile Val Leu Ser Ile Leu Gly Pro Ala
1090 1095 1100

Ala Glu Val Arg Pro Ile Glu His Pro Glu Arg Leu Ser Thr Pro Ser
1105 1110 1115 1120

Pr ° LVS CyS S " ff° Ile «" Arg Ly, Ser Ser Ser Ser Ser Gly 
"25 1130 ll3s 

Gly Gly Asp Lys Ser Ser Arg Thr Ser Lys Ser Ala Pro Arg Pro Cys
1140 1145 1150

Ala Pro Ser Leu Thr Thr Ile Thr Glu Glu Pro Ser Ser Trp His Ser
1155 1160 1165
Ser Ma o „ ls Ser Val Gln ^ Ser ^ ^ ^ ^ ^ ^ ^ ^ 



20 



1180 



neV 81 G1U TIL Thr Th * ^ *» «y A,p ser Ala Ser 

1190 "95 1200 

Gly Arg Sec Tht p Iht ^ Ser ^ ^ ^ ^ ^ ^ ^ 

1205 1210 ins 

Thr Lys Val Thr Ala Thr Ala Asn Ile Lys Val Glu Val Val Thr Pro
1220 1225 1230

Ser Asp Arg Lys Ser Arg Arg Ser Tyr His Tyr Tyr Asp Arg Arg Arg
1235 1240 1245

Asp Arg Asp Glu Asp Arg Asp Arg Asp Arg Glu Arg Asp Arg Asp Arg
1250 1255 1260

Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg Asp Arg
1265 1270 1275 1280

Glu Arg Ser Arg Glu Arg Asp Arg Arg Asp Arg Tyr Arg Asp Glu Arg
1285 1290 1295

Asp His Arg Ala Ser Pro Arg Glu Lys Arg Gln Arg Phe Trp Thr
1300 1305 1310

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 4434 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:



CGAAACAAGA GAGTAGGGAG AGCGTCTGTG TTGX'GtGttG 60

ACGCACACAG GCGCAAAACA GTGCACACAG ACGCCCGCTG GGCAAGAGAG AGTGAGAGAG 120

AGAAACAGCG GCGCGCGCTC GCCTAATGAA GTTGTTGGCC TGGCTGGCGT GCCGCATCCA 180

CGAGATACAG ATACATCTCT CATGGACCGC GACAGCCTCC CACGCGTTCC GGACACACAC 240

GGCGATGTGG TCGATGAGAA ATTATTCTCG GATCTTTACA TACGCACCAG CTGGGTGGAC 300

r. : .VJAAGTGG CGCTCGATCA GATAGATAAG CGCAAAGCGC GTGGCAGCCG CACGGCGATC 360

TATCTGCGAT CAGTATTCCA GTCCCACCTC GAAACCCTCG GCAGCTCCGT GCAAAAGCAC 420

GCGGGCAAGG TGCTATTCGT GGCTATCCTG GTGCTGAGCA CCTTCTGCGT CGGCCTGAAG 480

AGCGCCCAGA TCCACTCCAA GGTGCACCAG CTGTGGATCC AGGAGGGCGG CCGGCTGGAG 540

GCGGAACTGG CCTACACACA GAAGACGATC GGCGAGGACG AGTCGGCCAC GCATCAGCTG 600

CTCATTCAGA CGACCCACGA CCCGAACGCC TCCGTCCTGC ATCCGCAGGC GCTGCTTGCC 660

CACCTGGAGG TCCTGGTCAA GGCCACCGCC GTCAAGGTGC ACCTCTACGA CACCGAATGG 720

GGGCTGCGCG ACATGTGCAA CATGCCGAGC ACGCCCTCCT TCGAGGGCAT CTACTACATC 780

GAGCAGATCC TGCGCCACCT CATTCCGTGC TCGATCATCA CGCCGCTGGA CTGTTTCTGG 840

GAGGGAAGCC AGCTGTTGGG TCCGGAATCA GCGGTCGTTA TACCAGGCCT CAACCAACGA 900

CTCCTGTGGA CCACCCTGAA TCCCGCCTCT GTGATGCAGT ATATGAAACA AAAGATGTCC 960

GAGGAAAAGA TCAGCTTCGA CTTCGAGACC GTGGAGCAGT ACATGAAGCG TGCGGCCATT 1020

GGCAGTGGCT ACATGGAGAA GCCCTGCCTG AACCCACTGA ATCCCAATTG CCCGGACACG 1080

GCACCGAACA AGAACAGCAC CCAGCCGCCG GATGTGGGAG CCATCCTGTC CGGAGGCTGC 1140

TACGGTTATG CCGCGAAGCA CATGCACTGG CCGGAGGAGC TGATTGTGGG CGGACGGAAG 1200

AGGAACCGCA GCGGACACTT GAGGAAGGCC CAGGCCCTGC AGTCGGTGGT GCAGCTGATG 1260

ACCGAGAAGG AAATGTACGA CCAGTGGCAG GACAACTACA AGGTGCACCA TCTTGGATGG 1320

ACGCAGGAGA AGGCAGCGGA GGTTTTGAAC GCCTGGCAGC GCAACTTTTC GCGGGAGGTG 1380

GAACAGCTGC TACGTAAACA GTCGAGAATT GCCACCAACT ACGATATCTA CGTGTTCAGC 1440

TCGGCTGCAC TGGATGACAT CCTGGCCAAG TTCTCCCATC CCAGCGCCTT GTCCATTGTC 1500

ATCGGCGTGG CCGTCACCGT TTTGTATGCC TTTTGCACGC TCCTCCGCTG GAGGGACCCC 1560

GTCCGTGGCC AGAGCAGTGT GGGCGTGGCC GGAGTTCTGC TCATGTGCTT CAGTACCGCC 1620

GCCGGATTGG GATTGTCAGC CCTGCTCGGT ATCGTTTTCA ATGCGCTGAC CGCTGCCTAT 1680

GCGGAGAGCA ATCGGCGGGA GCAGACCAAG CTGATTCTCA AGAACGCCAG CACCCAGGTG 1740

GTTCCGTTTT TGGCCCTTGG TCTGGGCGTC GATCACATCT TCATAGTGGG ACCGAGCATC 1800

CTGTTCAGTG CCTGCAGCAC CGCAGGATCC TTCTTTGCGG CCGCCTTTAT TCCGGTGCCG 1860

GCTTTGAAGG TATTCTGTCT GCAGGCTGCC ATCGTAATGT GCTCCAATTT GGCAGCGGCT 1920

CTATTGGTTT TTCCGGCCAT GATTTCGTTG GATCTACGGA GACGTACCGC CGGCAGGGCG 1980

GACATCTTCT GCTGCTGTTT TCCGGTGTGG AAGGAACAGC CGAAGGTGGC ACCTCCGGTG 2040

CTGCCGCTGA ACAACAACAA CGGGCGCGGG GCCCGGCATC CGAAGAGCTG CAACAACAAC 2100

AGGGTGCCGC TGCCCGCCCA GAATCCTCTG CTGGAACAGA GGGCAGACAT CCCTGGGAGC 2160

AGTCACTCAC TGGCGTCCTT CTCCCTGGCA ACCTTCGCCT TTCAGCACTA CACTCCCTTC 2220

CTCATGCGCA GCTGGGTGAA GTTCCTGACC GTTATGGGTT TCCTGGCGGC CCTCATATCC 2280

AGCTTGTATG CCTCCACGCG CCTTCAGGAT GGCCTGGACA TTATTGATCT GGTGCCCAAG 2340

GACAGCAACG AGCACAAGTT CCTGGATGCT CAAACTCGGC TCTTTGGCTT CTACAGCATG 2400

TATGCGGTTA CCCAGGGCAA CTTTGAATAT CCCACCCAGC AGCAGTTGCT CAGGGACTAC 2460

CATGATTCCT TTGTGCGGGT GCCACATGTG ATCAAGAATG ATAACGGTGG ACTGCCGGAC 2520

TTCTGGCTGC TGCTCTTCAG CGAGTGGCTG GGTAATCTGC AAAAGATATT CGACGAGGAA 2580

TACCGCGACG GACGGCTGAC CAAGGAGTGC TGGTTCCCAA ACGCCAGCAG CGATGCCATC 2640

CTGGCCTACA AGCTAATCGT GCAAACCGGC CATGTGGACA ACCCCGTGGA CAAGGAACTG 2700

GTGCTCACCA ATCGCCTGGT CAACAGCGAT GGCATCATCA ACCAACGCGC CTTCTACAAC 2760

TATCTGTCGG CATGGGCCAC CAACGACGTC TTCGCCTACG GAGCTTCTCA GGGCAAATTG 2820

TATCCGGAAC CGCGCCAGTA TTTTCACCAA CCCAACGAGT ACGATCTTAA GATACCCAAG 2880

AGTCTGCCAT TGGTCTACGC TCAGATGCCC TTTTACCTCC ACGGACTAAC AGATACCTCG 2940

CAGATCAAGA CCCTGATAGG TCATATTCGC GACCTGAGCG TCAAGTACGA GGGCTTCGGC 3000

CTGCCCAACT ATCCATCGGG CATTCCCTTC ATCTTCTGGG AGCAGTACAT GACCCTGCGC 3060

TCCTCACTGG CCATGATCCT GGCCTGCGTG CTACTCGCCG CCCTGGTGCT GGTCTCCCTG 3120

CTCCTGCTCT CCGTTTGGGC CGCCGTTCTC GTGATCCTCA GCGTTCTGGC CTCGCTGGCC 3180

CAGATCTTTG GGGCCATGAC TCTGCTGGGC ATCAAACTCT CGGCCATTCC GGCAGTCATA 3240

CTCATCCTCA GCGTGGGCAT GATGCTGTGC TTCAATGTGC TGATATCACT GGGCTTCATG 3300

ACATCCGTTG GCAACCGACA GCGCCGCGTC CAGCTGAGCA TGCAGATGTC CCTGGGACCA 3360
CTTGTCCACG GCATGCTGAC CTCCGGAGTG GCCGTGTTCA TGCTCTCCAC CTCGCCCTTT 3420

GAGTTTGTGA TCCGGCACTT CTGCTGGCTT CTGCTGGTGG TCTTATGCGT TGGCGCCTGC 3480

AACAGCCTTT TGGTGTTCCC CATCCTACTG AGCATGGTGG GACCGGAGGC GGAGCTGGTG 3540

CCGCTGGAGC ATCCAGACCG CATATCCACG CCCTCTCCGC TGCCCGTGCG CAGCAGCAAG 3600

AGATCGGGCA AATCCTATGT GGTGCAGGGA TCGCGATCCT CGCGAGGCAG CTGCCAGAAG 3660

TCGCATCACC ACCACCACAA AGACCTTAAT GATCCATCGC TGACGACGAT CACCGAGGAG 3720

CCGCAGTCGT GGAAGTCCAG CAACTCGTCC ATCCAGATGC CCAATGATTG GACCTACCAG 3780

CCGCGGGAAC AGCGACCCGC CTCCTACGCG GCCCCGCCCC CCGCCTATCA CAAGGCCGCC 3840

GCCCAGCAGC ACCACCAGCA TCAGGGCCCG CCCACAACGC CCCCGCCTCC CTTCCCGACG 3900

GCCTATCCGC CGGAGCTGCA GAGCATCGTG GTGCAGCCGG AGGTGACGGT GGAGACGACG 3960

CACTCGGACA GCAACACCAC CAAGGTGACG GCCACGGCCA ACATCAAGGT GGAGCTGGCC 4020

ATGCCCGGCA GGGCGGTGCG CAGCTATAAC TTTACGAGTT AGCACTAGCA CTAGTTCCTG 4080

TAGCTATTAG GACGTATCTT TAGACTCTAG CCTAAGCCGT AACCCTATTT GTATCTGTAA 4140

AATCOATTTG TCCAGCGGGT CTGCTGAGGA TTTCGTTCTC ATGGATTCTC ATGGATTCTC 4200

ATGGATGCTT AAATGGCATG GTAATTGGCA AAATATCAAT TTTTGTGTCT CAAAAAGATG 4260

CATTAGCTTA TGGTTTCAAG ATACATTTTT AAAGAGTCCG CCAGATATTT ATATAAAAAA 4320

AATCCAAAAT CGACGTATCC ATGAAAATTG AAAAGCTAAG CAGACCCGTA TGTATGTATA 4380

TGTGTATGCA TGTTAGTTAA TTTCCCGAAG TCCGGTATTT ATAGCAGCTG CCTT 4434

(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1285 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Met Asp Arg Asp Ser Leu Pro Arg Val Pro Asp Thr His Gly Asp Val
1 5 10 15

Val Asp Glu Lys Leu Phe Ser Asp Leu Tyr Ile Arg Thr Ser Trp Val
20 25 30

Asp Ala Gln Val Ala Leu Asp Gln Ile Asp Lys Gly Lys Ala Arg Gly
35 40 45

Ser Arg Thr Ala Ile Tyr Leu Arg Ser Val Phe Gln Ser His Leu Glu
50 55 60
Thr Leu Gly Ser Ser Val Gln Lys His Ala Gly Lys Val Leu Phe Val
65 70 75 80

Ala Ile Leu Val Leu Ser Thr Phe Cys Val Gly Leu Lys Ser Ala Gln
85 90 95

Ile His Ser Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Arg Leu
100 105 110

Glu Ala Glu Leu Ala Tyr Thr Gln Lys Thr Ile Gly Glu Asp Glu Ser
115 120 125

Ala Thr His Gln Leu Leu Ile Gln Thr Thr His Asp Pro Asn Ala Ser
130 135 140

Val Leu His Pro Gln Ala Leu Leu Ala His Leu Glu Val Leu Val Lys
145 150 155 160

Ala Thr Ala Val Lys Val His Leu Tyr Asp Thr Glu Trp Gly Leu Arg
165 170 175

Asp Met Cys Asn Met Pro Ser Thr Pro Ser Phe Glu Gly Ile Tyr Tyr
180 185 190

Ile Glu Gln Ile Leu Arg His Leu Ile Pro Cys Ser Ile Ile Thr Pro
195 200 205

Leu Asp Cys Phe Trp Glu Gly Ser Gln Leu Leu Gly Pro Glu Ser Ala
210 215 220

Val Val Ile Pro Gly Leu Asn Gln Arg Leu Leu Trp Thr Thr Leu Asn
225 230 235 240

Pro Ala Ser Val Met Gln Tyr Met Lys Gln Lys Met Ser Glu Glu Lys
245 250 255



Ile Ser Phe Asp Phe Glu Thr Val Glu Gln Tyr Met Lys Arg Ala Ala
260 265 270

Ile Gly Ser Gly Tyr Met Glu Lys Pro Cys Leu Asn Pro Leu Asn Pro
275 280 285

Asn Cys Pro Asp Thr Ala Pro Asn Lys Asn Ser Thr Gln Pro Pro Asp
290 295 300

Val Gly Ala Ile Leu Ser Gly Gly Cys Tyr Gly Tyr Ala Ala Lys His
305 310 315 320

Met His Trp Pro Glu Glu Leu Ile Val Gly Gly Arg Lys Arg Asn Arg
325 330 335

Ser Gly His Leu Arg Lys Ala Gln Ala Leu Gln Ser Val Val Gln Leu
340 345 350

Met Thr Glu Lys Glu Met Tyr Asp Gln Trp Gln Asp Asn Tyr Lys Val
355 360 365

His His Leu Gly Trp Thr Gln Glu Lys Ala Ala Glu Val Leu Asn Ala
370 375 380

Trp Gln Arg Asn Phe Ser Arg Glu Val Glu Gln Leu Leu Arg Lys Gln
385 390 395 400

Ser Arg Ile Ala Thr Asn Tyr Asp Ile Tyr Val Phe Ser Ser Ala Ala
405 410 415

Leu Asp Asp Ile Leu Ala Lys Phe Ser His Pro Ser Ala Leu Ser Ile
420 425 430

Val Ile Gly Val Ala Val Thr Val Leu Tyr Ala Phe Cys Thr Leu Leu
435 440 445

Arg Trp Arg Asp Pro Val Arg Gly Gln Ser Ser Val Gly Val Ala Gly
450 455 460

Val Leu Leu Met Cys Phe Ser Thr Ala Ala Gly Leu Gly Leu Ser Ala
465 470 475 480

Leu Leu Gly Ile Val Phe Asn Ala Leu Thr Ala Ala Tyr Ala Glu Ser
485 490 495

Asn Arg Arg Glu Gln Thr Lys Leu Ile Leu Lys Asn Ala Ser Thr Gln
500 505 510

Val Val Pro Phe Leu Ala Leu Gly Leu Gly Val Asp His Ile Phe Ile
515 520 525

Val Gly Pro Ser Ile Leu Phe Ser Ala Cys Ser Thr Ala Gly Ser Phe
530 535 540

Phe Ala Ala Ala Phe Ile Pro Val Pro Ala Leu Lys Val Phe Cys Leu
545 550 555 560

Gln Ala Ala Ile Val Met Cys Ser Asn Leu Ala Ala Ala Leu Leu Val
565 570 575

Phe Pro Ala Met Ile Ser Leu Asp Leu Arg Arg Arg Thr Ala Gly Arg
580 585 590

Ala Asp Ile Phe Cys Cys Cys Phe Pro Val Trp Lys Glu Gln Pro Lys
595 600 605

Val Ala Pro Pro Val Leu Pro Leu Asn Asn Asn Asn Gly Arg Gly Ala
610 615 620

Arg His Pro Lys Ser Cys Asn Asn Asn Arg Val Pro Leu Pro Ala Gln
625 630 635 640

Asn Pro Leu Leu Glu Gln Arg Ala Asp Ile Pro Gly Ser Ser His Ser
645 650 655

Leu Ala Ser Phe Ser Leu Ala Thr Phe Ala Phe Gln His Tyr Thr Pro
660 665 670

Phe Leu Met Arg Ser Trp Val Lys Phe Leu Thr Val Met Gly Phe Leu
675 680 685

Ala Ala Leu Ile Ser Ser Leu Tyr Ala Ser Thr Arg Leu Gln Asp Gly
690 695 700

Leu Asp Ile Ile Asp Leu Val Pro Lys Asp Ser Asn Glu His Lys Phe
705 710 715 720

Leu Asp Ala Gln Thr Arg Leu Phe Gly Phe Tyr Ser Met Tyr Ala Val
725 730 735

Thr Gln Gly Asn Phe Glu Tyr Pro Thr Gln Gln Gln Leu Leu Arg Asp
740 745 750

Tyr His Asp Ser Phe Arg Val Pro His Val Ile Lys Asn Asp Asn Gly
755 760 765

Gly Leu Pro Asp Phe Trp Leu Leu Leu Phe Ser Glu Trp Leu Gly Asn
770 775 780

Leu Gln Lys Ile Phe Asp Glu Glu Tyr Arg Asp Gly Arg Leu Thr Lys
785 790 795 800

Glu Cys Trp Phe Pro Asn Ala Ser Ser Asp Ala Ile Leu Ala Tyr Lys
805 810 815

Leu Ile Val Gln Thr Gly His Val Asp Asn Pro Val Asp Lys Glu Leu
820 825 830

Val Leu Thr Asn Arg Leu Val Asn Ser Asp Gly Ile Ile Asn Gln Arg
835 840 845

Ala Phe Tyr Asn Tyr Leu Ser Ala Trp Ala Thr Asn Asp Val Phe Ala
850 855 860

Tyr Gly Ala Ser Gln Gly Lys Leu Tyr Pro Glu Pro Arg Gln Tyr Phe
865 870 875 880

His Gln Pro Asn Glu Tyr Asp Leu Lys Ile Pro Lys Ser Leu Pro Leu
885 890 895

Val Tyr Ala Gln Met Pro Phe Tyr Leu His Gly Leu Thr Asp Thr Ser
900 905 910

Gln Ile Lys Thr Leu Ile Gly His Ile Arg Asp Leu Ser Val Lys Tyr
915 920 925

Glu Gly Phe Gly Leu Pro Asn Tyr Pro Ser Gly Ile Pro Phe Ile Phe
930 935 940

Trp Glu Gln Tyr Met Thr Leu Arg Ser Ser Leu Ala Met Ile Leu Ala
945 950 955 960

Cys Val Leu Leu Ala Ala Leu Val Leu Val Ser Leu Leu Leu Leu Ser
965 970 975

Val Trp Ala Ala Val Leu Val Ile Leu Ser Val Leu Ala Ser Leu Ala
980 985 990

Gln Ile Phe Gly Ala Met Thr Leu Leu Gly Ile Lys Leu Ser Ala Ile
995 1000 1005

Pro Ala Val Ile Leu Ile Leu Ser Val Gly Met Met Leu Cys Phe Asn
1010 1015 1020

Val Leu Ile Ser Leu Gly Phe Met Thr Ser Val Gly Asn Arg Gln Arg
1025 1030 1035 1040

Arg Val Gln Leu Ser Met Gln Met Ser Leu Gly Pro Leu Val His Gly
1045 1050 1055

Met Leu Thr Ser Gly Val Ala Val Phe Met Leu Ser Thr Ser Pro Phe
1060 1065 1070

Glu Phe Val Ile Arg His Phe Cys Trp Leu Leu Leu Val Val Leu Cys
1075 1080 1085

Val Gly Ala Cys Asn Ser Leu Leu Val Phe Pro Ile Leu Leu Ser Met
1090 1095 1100

Val Gly Pro Glu Ala Glu Leu Val Pro Leu Glu His Pro Asp Arg Ile
1105 1110 1115 1120

Ser Thr Pro Ser Pro Leu Pro Val Arg Ser Ser Lys Arg Ser Gly Lys
1125 1130 1135

Ser Tyr Val Val Gln Gly Ser Arg Ser Ser Arg Gly Ser Cys Gln Lys
1140 1145 1150

Ser His His His His His Lys Asp Leu Asn Asp Pro Ser Leu Thr Thr
1155 1160 1165

Ile Thr Glu Glu Pro Gln Ser Trp Lys Ser Ser Asn Ser Ser Ile Gln
1170 1175 1180

Met Pro Asn Asp Trp Thr Tyr Gln Pro Arg Glu Gln Arg Pro Ala Ser
1185 1190 1195 1200

Tyr Ala Ala Pro Pro Pro Ala Tyr His Lys Ala Ala Ala Gln Gln His
1205 1210 1215

His Gln His Gln Gly Pro Pro Thr Thr Pro Pro Pro Pro Phe Pro Thr
1220 1225 1230

Ala Tyr Pro Pro Glu Leu Gln Ser Ile Val Val Gln Pro Glu Val Thr
1235 1240 1245

Val Glu Thr Thr His Ser Asp Ser Asn Thr Thr Lys Val Thr Ala Thr
1250 1255 1260

Ala Asn Ile Lys Val Glu Leu Ala Met Pro Gly Arg Ala Val Arg Ser
1265 1270 1275 1280

Tyr Asn Phe Thr Ser
1285

(2) INFORMATION FOR SEQ ID NO: 7:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 345 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 7:

AAGGTCCATC AGCTTTGGAT ACAGGAAGGT GGTTCGCTCG AGCATGAGCT AGCCTACACG 60

CAGAAATCGC TCGGCGAGAT GGACTCCTCC ACGCACCAGC TGCTAATCCA AACNCCCAAA 120 

GATATGGACG CCTCGATACT GCACCCGAAC GCGCTACTGA CGCACCTGGA CGTGGTGAAG 180 

AAAGCGATCT CGGTGACGGT GCACATGTAC GACATCACGT GGAGNCTCAA GGACATGTGC 240

TACTCGCCCA GCATACCGAG NTTCGATACG CACTTTATCG AGCAGATCTT CGAGAACATC 300

ATACCGTGCG CGATCATCAC GCCGCTGGAT TGCTTTTGGG AGGGA 345

(2) INFORMATION FOR SEQ ID NO:8:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 115 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: peptide 



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8:

Lys Val His Gln Leu Trp Ile Gln Glu Gly Gly Ser Leu Glu His Glu
1 5 10 15

Leu Ala Tyr Thr Gln Lys Ser Leu Gly Glu Met Asp Ser Ser Thr His
20 25 30

Gln Leu Leu Ile Gln Thr Pro Lys Asp Met Asp Ala Ser Ile Leu His
35 40 45

Pro Asn Ala Leu Leu Thr His Leu Asp Val Val Lys Lys Ala Ile Ser
50 55 60

Val Thr Val His Met Tyr Asp Ile Thr Trp Xaa Leu Lys Asp Met Cys
65 70 75 80

Tyr Ser Pro Ser Ile Pro Xaa Phe Asp Thr His Phe Ile Glu Gln Ile
85 90 95

Phe Glu Asn Ile Ile Pro Cys Ala Ile Ile Thr Pro Leu Asp Cys Phe
100 105 110

Trp Glu Gly
115

(2) INFORMATION FOR SEQ ID NO: 9:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5187 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 9:

r^TCTGTCA CCCGGAGCCG GAGTCCCCGG CGGCCAGCAG CGTCCTCGCG AGCCGAGCGC 60

^AGGCGCCC CCGGAGCCCG CGGCGGCGGC GGCAACATGG CCTCGGCTGG TAACGCCGCC 120

GGGGCCCTGG GCAGGCAGGC CGGCGGCGGG AGGCGCAGAC GGACCGGGGG ACCGCACCGC 180

GCCGCGCCGG ACCGGGACTA TCTGCACCGG CCCAGCTACT GCGACGCCGC CTTCGCTCTG 240

GAGCAGATTT CCAAGGGGAA GGCTACTGGC CGGAAAGCGC CGCTGTGGCT GAGAGCGAAG 300

TTTCAGAGAC TCTTATTTAA ACTGGGTTGT TACATTCAAA AGAACTGCGG CAAGTTTTTG 360

GTTGTGGGTC TCCTCATATT TGGGGCCTTC GCTGTGGGAT TAAAGGCAGC TAATCTCGAG 420 

ACCAACGTGG AGGAGCTGTG GGTGGAAGTT GGTGGACGAG TGAGTCGAGA ATTAAATTAT 480 

ACCCGTCAGA AGATAGGAGA AGAGGCTATG TTTAATCCTC AACTCATGAT ACAGACTCCA 540

AAAGAAGAAG GCGCTAATGT TCTGACCACA GAGGCTCTCC TGCAACACCT GGACTCAGCA 600 

CTCCAGGCCA GTCGTGTGCA CGTCTACATG TATAACAGGC AATGGAAGTT GGAACATTTG 660 

TGCTACAAAT CAGGGGAACT TATCACGGAG ACAGGTTACA TGGATCAGAT AATAGAATAC 720 

CTTTACCCTT GCTTAATCAT TACACCTTTG GACTGCTTCT GGGAAGGGGC AAAGCTACAG 780

TCCGGGACAG CATACCTCCT AGGTAAGCCT CCTTTACGGT GGACAAACTT TGACCCCTTG 840

GAATTCCTAG AAGAGTTAAA GAAAATAAAC TACCAAGTGG ACAGCTGGGA GGAAATGCTG 900 

AATAAAGCCG AAGTTGGCCA TGGGTACATG GACCGGCCTT GCCTCAACCC AGCCGACCCA 960 

GATTGCCCTG CCACAGCCCC TAACAAAAAT TCAACCAAAC CTCTTGATGT GGCCCTTGTT 1020

TTGAATGGTG GATGTCAAGG TTTATCCAGG AAGTATATGC ATTGGCAGGA GGAGTTGATT 1080

GTGGGTGGTA CCGTCAAGAA TGCCACTGGA AAACTTGTCA GCGCTCACGC CCTGCAAACC 1140 

ATGTTCCAGT TAATGACTCC CAAGCAAATG TATGAACACT TCAGGGGCTA CGACTATGTC 1200

TCTCACATCA ACTGGAATGA AGACAGGGCA GCCGCCATCC TGGAGGCCTG GCAGAGGACT 1260

TACGTGGAGG TGGTTCATCA AAGTGTCGCC CCAAACTCCA CTCAAAAGGT GCTTCCCTTC 1320

ACAACCACGA CCCTGGACGA CATCCTAAAA TCCTTCTCTG ATGTCAGTGT CATCCGAGTG 1380 

GCCAGCGGCT ACCTACTGAT GCTTGCCTAT GCCTGTTTAA CCATGCTGCG CTGGGACTGC 1440

TCCAAGTCCC AGGGTGCCGT GGGGCTGGCT GGCGTCCTGT TGGTTGCGCT GTCAGTGGCT 1500 

GCAGGATTGG GCCTCTGCTC CTTGATTGGC ATTTCTTTTA ATGCTGCGAC AACTCAGGTT 1560

TTGCCGTTTC TTGCTCTTGG TGTTGGTGTG GATGATGTCT TCCTCCTGGC CCATGCATTC 1620 

AGTGAAACAG GACAGAATAA GAGGATTCCA TTTGAGGACA GGACTGGGGA GTGCCTCAAG 1680

CGCACCGGAG CCAGCGTGGC CCTCACCTCC ATCAGCAATG TCACCGCCTT CTTCATGGCC 1740

GCATTGATCC CTATCCCTGC CCTGCGAGCG TTCTCCCTCC AGGCTGCTGT GGTGGTGGTA 1800 

TTCAATTTTG CTATGGTTCT GCTCATTTTT CCTGCAATTC TCAGCATGGA TTTATACAGA 1860

CGTGAGGACA GAAGATTGGA TATTTTCTGC TGTTTCACAA GCCCCTGTGT CAGCAGGGTG 1920 

ATTCAAGTTG AGCCACAGGC CTACACAGAG CCTCACAGTA ACACCCGGTA CAGCCCCCCA 1980 

CCCCCATACA CCAGCCACAG CTTCGCCCAC GAAACCCATA TCACTATGCA GTCCACCGTT 2040
CAGCTCCGCA CAGAGTATGA CCCTCACACG CACGTGTACT ACACCACCGC CGAGCCACGC 2100
TCTGAGATCT CTGTACAGCC TGTTACCGTC ACCCAGGACA ACCTCAGCTG TCAGAGTCCC 2160
GAGAGCACCA GCTCTACCAG GGACCTGCTC TCCCAGTTCT CAGACTCCAG CCTCCACTGC 2220
CTCGAGCCCC CCTGCACCAA GTGGACACTC TCTTCGTTTG CAGAGAAGCA CTATGCTCCT 2280
TTCCTCCTGA AACCCAAAGC CAAGGTTGTG GTAATCCTTC TTTTCCTGGG CTTGCTGGGG 2340
GTCAGCCTTT ATGGGACCAC CCGAGTGAGA GACGGGCTGG ACCTCACGGA CATTGTTCCC 2400
CGGGAAACCA GAGAATATGA CTTCATAGCT GCCCAGTTCA AGTACTTCTC TTTCTACAAC 2460
ATGTATATAG TCACCCAGAA AGCAGACTAC CCGAATATCC AGCACCTACT TTACGACCTT 2520
CATAAGAGTT TCAGCAATGT GAAGTATGTC ATGCTGGAGG AGAACAAGCA ACTTCCCCAA 2580
ATGTGGCTGC ACTACTTTAG AGACTGGCTT CAAGGACTTC AGGATGCATT TGACAGTGAC 2640
TGGGAAACTG GGAGGATCAT GCCAAACAAT TATAAAAATG GATCAGATGA CGGGGTCCTC 2700
GCTTACAAAC TCCTGGTGCA GACTGGCAGC CGAGACAAGC CCATCGACAT TAGTCAGTTG 2760
ACTAAACAGC GTCTGGTAGA CGCAGATGGC ATCATTAATC CGAGCGCTTT CTACATCTAC 2820
CTGACCGCTT GGGTCAGCAA CGACCCTGTA GCTTACGCTG CCTCCCAGGC CAACATCCGG 2880
CCTCACCGGC CGGAGTGGGT CCATGACAAA GCCGACTACA TGCCAGAGAC CAGGCTGAGA 2940
ATCCCAGCAG CAGAGCCCAT CGAGTACGCT CAGTTCCCTT TCTACCTCAA CGGCCTACGA 3000
GACACCTCAG ACTTTGTGGA AGCCATAGAA AAAGTGAGAG TCATCTGTAA CAACTATACG 3060

AGCCTGGGAC TGTCCAGCTA CCCCAATGGC TACCCCTTCC TGTTCTGGGA GCAATACATC 3120 

AGCCTGCGCC ACTGGCTGCT GCTATCCATC AGCGTGGTGC TGGCCTGCAC GTTTCTAGTG 3180 

TGCGCAGTCT TCCTCCTGAA CCCCTGGACG GCCGGGATCA TTGTCATGGT CCTGGCTCTG 3240

ATGACCGTTG AGCTCTTTGG CATGATGGGC CTCATTGGGA TCAAGCTGAG TGCTGTGCCT 3300 

GTGGTCATCC TGATTGCATC TGTTGGCATC GGAGTGGAGT TCACCGTCCA CGTGGCTTTG 3360 

GCCTTTCTGA CAGCCATTGG GGACAAGAAC CACAGGGCTA TGCTCGCTCT GGAACACATG 3420 

TTTGCTCCCG TTCTGGACGG TGCTGTGTCC ACTCTGCTGG GTGTACTGAT GCTTGCAGGG 3480

TCCGAATTTG ATTTCATTGT CAGATACTTC TTTGCCGTCC TGGCCATTCT CACCGTCTTG 3540

GGGGTTCTCA ATGGACTGGT TCTGCTGCCT GTCCTCTTAT CCTTCTTTGG ACCGTGTCCT 3600

GAGGTGTCTC CAGCCAATGG CCTAAACCGA CTGCCCACTC CTTCGCCTGA GCCGCCTCCA 3660

AGTGTCGTCC GGTTTGCCGT GCCTCCTGGT CACACGAACA ATGGGTCTGA TTCCTCCGAC 3720

TCGGAGTACA GCTCTCAGAC CACGGTGTCT GGCATCAGTG AGGAGCTCAG GCAATACGAA 3780 

GCACAGCAGG GTGCCGGAGG CCCTGCCCAC CAAGTGATTG TGGAAGCCAC AGAAAACCCT 3840 

GTCTTTGCCC GGTCCACTGT GGTCCATCCG GACTCCAGAC ATCAGCCTCC CTTGACCCCT 3900 
CGGCAACAGC CCCACCTGGA CTCTGGCTCC TTGTCCCCTG GACGGCAAGG CCAGCAGCCT 3960

CGAAGGGATC CCCCTAGAGA AGGCTTGCGG CCACCCCCCT ACAGACCGCG CAGAGACGCT 4020

TTTGAAATTT CTACTGAAGG GCATTCTGGC CCTAGCAATA GGGACCGCTC AGGGCCCCGT 4080

GGGGCCCGTT CTCACAACCC TCGGAACCCA ACGTCCACCG CCATGGGCAG CTCTGTGCCC 4140

AGCTACTGCC AGCCCATCAC CACTGTGACG GCTTCTGCTT CGGTGACTGT TGCTGTGCAT 4200

CCCCCGCCTG GACCTGGGCG CAACCCCCGA GGGGGGCCCT GTCCAGGCTA TGAGAGCTAC 4260

CCTGAGACTG ATCACGGGGT ATTTGAGGAT CCTCATGTGC CTTTTCATGT CAGGTGTGAG 4320

AGGAGGGACT CAAAGGTGGA GGTCATAGAG CTACAGGACG TGGAATGTGA GGAGAGGCCG 4380

TGGGGGAGCA GCTCCAACTG AGGGTAATTA AAATCTGAAG CAAAGAGGCC AAAGATTGGA 4440

AAGCCCCGCC CCCACCTCTT TCCAGAACTG CTTGAAGAGA ACTGCTTGGA ATTATGGGAA 4500

GGCAGTTCAT TGTTACTGTA ACTGATTGTA TTATTKKGTG AAATATTTCT ATAAATATTT 4560

AARAGGTGTA CACATGTAAT ATACATGGAA ATGCTGTACA GTCTATTTCC TGGGGCCTCT 4620

CCACTCCTGC CCCAGAGTGG GGAGACCACA GGGGCCCTTT CCCCTGTGTA CATTGGTCTC 4680

TGTGCCACAA CCAAGCTTAA CTTAGTTTTA AAAAAAATCT CCCAGCATAT GTCGCTGCTG 4740

CTTAAATATT GTATAATTTA CTTGTATAAT TCTATGCAAA TATTGCTTAT GTAATAGGAT 4800

TATTTCTAAA GGTTTCTGTT TAAAATATTT TAAATTTGCA TATCACAACC CTGTGGTAGG 4860

ATGAATTGTT ACTGTTAACT TTTGAACACG CTATGCGTGG TAATTGTTTA ACGAGCAGAC 4920

ATGAAGAAAA CAGGTTAATC CCAGTGGCTT CTCTAGGGGT AGTTGTATAT GGTTCGCATG 4980

GGTGGATGTG TGTGTGCATG TGACTTTCCA ATGTACTGTA TTGTGGTTTG TTGTTGTTGT 5040

TGCTGTTGTT GTTCATTTTG GTGTTTTTGG TTGCTTTGTA TGATCTTAGC TCTGGCCTAG 5100

GTGGGCTGGG AAGGTCCAGG TCTTTTTCTG TCGTGATGCT GGTGGAAAGG TGACCCCAAT 5160

CATCTGTCCT ATTCTCTGGG ACTATTC 5187



(2) INFORMATION FOR SEQ ID NO: 10: 



(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1434 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 10:



Met Ala Ser Ala Gly Asn Ala Ala Gly Ala Leu Gly Arg Gln Ala Gly
1 5 10 15

Gly Gly Arg Arg Arg Arg Thr Gly Gly Pro His Arg Ala Ala Pro Asp
20 25 30

Arg Asp Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu
35 40 45

Glu Gln Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp
50 55 60

Leu Arg Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile
65 70 75 80

Gln Lys Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly
85 90 95

Ala Phe Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu 
100 105 110 

Glu Leu Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr
115 120 125

Thr Arg Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met
130 135 140

Ile Gln Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala
145 150 155 160

Leu Leu Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val
165 170 175

Tyr Met Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser
180 185 190

Gly Glu Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr
195 200 205

Leu Tyr Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
210 215 220

Ala Lys Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu
225 230 235 240

Arg Trp Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys 
245 250 255 

Ile Asn Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu
260 265 270

Val Gly His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro
275 280 285

Asp Cys Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp
290 295 300

Val Ala Leu Val Leu Asn Gly Gly Cys Gln Gly Leu Ser Arg Lys Tyr
305 310 315 320

Met His Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ala
325 330 335

Thr Gly Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu
340 345 350

Met Thr Pro Lys Gln Met Tyr Glu His Phe Arg Gly Tyr Asp Tyr Val
355 360 365

Ser His Ile Asn Trp Asn Glu Asp Arg Ala Ala Ala Ile Leu Glu Ala
370 375 380

Trp Gln Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Pro Asn
385 390 395 400

Ser Thr Gln Lys Val Leu Pro Phe Thr Thr Thr Thr Leu Asp Asp Ile
405 410 415

Leu Lys Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr
420 425 430

Leu Leu Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys 
435 440 445 

Ser Lys Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala
450 455 460

Leu Ser Val Ala Ala Gly Leu Gly Leu Cys Ser Leu Ile Gly Ile Ser
465 470 475 480

Phe Asn Ala Ala Thr Thr Gln Val Leu Pro Phe Leu Ala Leu Gly Val
485 490 495

Gly Val Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly
500 505 510

Gln Asn Lys Arg Ile Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys
515 520 525

Arg Thr Gly Ala Ser Val Ala Leu Thr Ser Ile Ser Asn Val Thr Ala
530 535 540

Phe Phe Met Ala Ala Leu Ile Pro Ile Pro Ala Leu Arg Ala Phe Ser
545 550 555 560

Leu Gln Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu
565 570 575

Ile Phe Pro Ala Ile Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg
580 585 590

Arg Leu Asp Ile Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val
595 600 605

Ile Gln Val Glu Pro Gln Ala Tyr Thr Glu Pro His Ser Asn Thr Arg
610 615 620 

Tyr Ser Pro Pro Pro Pro Tyr Thr Ser His Ser Phe Ala His Glu Thr 
625 630 635 640 

His Ile Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro
645 650 655

His Thr His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser
660 665 670

Val Gln Pro Val Thr Val Thr Gln Asp Asn Leu Ser Cys Gln Ser Pro
675 680 685

Glu Ser Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser
690 695 700

Ser Leu His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser
705 710 715 720

Phe Ala Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys
725 730 735

Val Val Val Ile Leu Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr
740 745 750

Gly Thr Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro
755 760 765

Arg Glu Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe
770 775 780

Ser Phe Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn
785 790 795 800

Ile Gln His Leu Leu Tyr Asp Leu His Lys Ser Phe Ser Asn Val Lys
805 810 815

Tyr Val Met Leu Glu Glu Asn Lys Gln Leu Pro Gln Met Trp Leu His
820 825 830

Tyr Phe Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp
835 840 845

Trp Glu Thr Gly Arg Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp
850 855 860

Asp Gly Val Leu AJa T Lys ^ ^ 

870 875 880

Lys Pro Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala
885 890 895

Asp Gly Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp
900 905 910

Val Ser Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg
915 920 925

Pro His Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu
930 935 940

Thr Arg Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe
945 950 955 960

Pro Phe Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala
965 970 975

Ile Glu Lys Val Arg Val Ile Cys Asn Asn Tyr Thr Ser Leu Gly Leu
980 985 990

Ser Ser Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile
995 1000 1005

Ser Leu Arg His Trp Leu Leu Leu Ser Ile Ser Val Val Leu Ala Cys
1010 1015 1020

Thr Phe Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly
1025 1030 1035 1040
Ile Ile Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met
1045 1050 1055

Met Gly Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu
1060 1065 1070

Ile Ala Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu
1075 1080 1085

Ala Phe Leu Thr Ala Ile Gly Asp Lys Asn His Arg Ala Met Leu Ala
1090 1095 1100

Leu Glu His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu
1105 1110 1115 1120

Leu Gly Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg
1125 1130 1135

Tyr Phe Phe Ala Val Leu Ala Ile Leu Thr Val Leu Gly Val Leu Asn
1140 1145 1150

Gly Leu Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Cys Pro
1155 1160 1165

Glu Val Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro
1170 1175 1180

Glu Pro Pro Pro Ser Val Val Arg Phe Ala Val Pro Pro Gly His Thr
1185 1190 1195 1200

Asn Asn Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr
1205 1210 1215

Val Ser Gly Ile Ser Glu Glu Leu Arg Gln Tyr Glu Ala Gln Gln Gly
1220 1225 1230

Ala Gly Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro
1235 1240 1245

Val Phe Ala Arg Ser Thr Val Val His Pro Asp Ser Arg His Gln Pro
1250 1255 1260

Pro Leu Thr Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Ser
1265 1270 1275 1280

Pro Gly Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly
1285 1290 1295

Leu Arg Pro Pro Pro Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser
1300 1305 1310

Thr Glu Gly His Ser Gly Pro Ser Asn Arg Asp Arg Ser Gly Pro Arg
1315 1320 1325

Gly Ala Arg Ser His Asn Pro Arg Asn Pro Thr Ser Thr Ala Met Gly
1330 1335 1340

Ser Ser Val Pro Ser Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser
1345 1350 1355 1360

Ala Ser Val Thr Val Ala Val His Pro Pro Pro Gly Pro Gly Arg Asn
1365 1370 1375
Pro Arg Gly Gly Pro Cys Pro Gly Tyr Glu Ser Tyr Pro Glu Thr Asp
1380 1385 1390

His Gly Val Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu
1395 1400 1405

Arg Arg Asp Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys
1410 1415 1420

Glu Glu Arg Pro Trp Gly Ser Ser Ser Asn
1425 1430

(2) INFORMATION FOR SEQ ID NO: 11:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 11 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 11: 

Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly
1 5 10

(2) INFORMATION FOR SEQ ID NO: 12:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 5 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 12:

Leu Ile Val Gly Gly
1 5

(2) INFORMATION FOR SEQ ID NO: 13:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 7 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 13:
Pro Phe Phe Trp Glu Gln Tyr

(2) INFORMATION FOR SEQ ID NO: 14:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 28 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14: 
GGACGAATTC AARGTNCAYC ARYTNTGG 
(2) INFORMATION FOR SEQ ID NO: 15:

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 26 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: other nucleic acid 
(A) DESCRIPTION: /desc = "primer"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 15:
GGACGAATTC CYTCCCARAA RCANTC 
(2) INFORMATION FOR SEQ ID NO: 16: 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 27 base pairs

(B) TYPE: nucleic acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 16:
GGACGAATTC YTNGANTGYT TYTGGGA

(2) INFORMATION FOR SEQ ID NO: 17:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 31 base pairs
(B) TYPE: nucleic acid

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear
(ii) MOLECULE TYPE: other nucleic acid
(A) DESCRIPTION: /desc = "primer"



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 17: 
CATACCAGCC AAGCTTGTCN GGCCARTGCA T 
(2) INFORMATION FOR SEQ ID NO: 18: 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 5288 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: cDNA



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 18:
GAATTCCGGG GACCGCAAGG AGTGCCGCGG AAGCGCCCGA AGGACAGGCT CGCTCGGCGC 60

GCCGGCTCTC GCTCTTCCGC GAACTGGATG TGGGCAGCGG CGGCCGCAGA GACCTCGGGA 120 

CCCCCGCGCA ATGTGGCAAT GGAAGGCGCA GGGTCTGACT CCCCGGCAGC GGCCGCGGCC 180 

GCAGCGGCAG CAGCGCCCGC CGTGTGAGCA GCAGCAGCGG CTGGTCTGTC AACCGGAGCC 240 

CGAGCCCGAG CAGCCTGCGG CCAGCAGCGT CCTCGCAAGC CGAGCGCCCA GGCGCGCCAG 300

GAGCCCGCAG CAGCGGCAGC AGCGCGCCGG GCCGCCCGGG AAGCCTCCGT CCCCGCGGCG 360

GCGGCGGCGG CGGCGGCGGC AACATGGCCT CGGCTGGTAA CGCCGCCGAG CCCCAGGACC 420
GCGGCGGCGG CGGCAGCGGC TGTATCGGTG CCCCGGGACG GCCGGCTGGA GGCGGGAGGC 480

GCAGACGGAC GGGGGGGCTG CGCCGTGCTG CCGCGCCGGA CCGGGACTAT CTGCACCGGC 540

CCAGCTACTG CGACGCCGCC TTCGCTCTGG AGCAGATTTC CAAGGGGAAG GCTACTGGCC 600

GGAAAGCGCC ACTGTGGCTG AGAGCGAAGT TTCAGAGACT CTTATTTAAA CTGGGTTGTT 660

a:a:::aaaa AAACTGCGGC AAGTTCTTGG TTGTGGGCCT CCTCATATTT GGGGCCTTCG 720

CGGTGGGATT AAAAGCAGCG AACCTCGAGA CCAACGTGGA GGAGCTGTGG GTGGAAGTTG 780

GAGGACGAGT AAGTCGTGAA TTAAATTATA CTCGCCAGAA GATTGGAGAA GAGGCTATGT 840 

TTAATCCTCA ACTCATGATA CAGACCCCTA AAGAAGAAGG TGCTAATGTC CTGACCACAG 900

AAGCGCTCCT ACAACACCTG GACTCGGCAC TCCAGGCCAG CCGTGTCCAT GTATACATGT 960

ACAACAGGCA GTGGAAATTG GAACATTTGT GTTACAAATC AGGAGAGCTT ATCACAGAAA 1020

~AC"TACAT GGATCAGATA ATAGAATATC TTTACCCTTG TTTGATTATT ACACCTTTGC 1080

ACTGCTTCTG GGAAGGGGCG AAATTACAGT CTGGGACAGC ATACCTCCTA GGTAAACCTC 1140
CTTTGCGGTG GACAAACTTC GACCCTTTGG AATTCCTGGA AGAGTTAAAG AAAATAAACT 1200
ATCAAGTGGA CAGCTGGGAG GAAATGCTGA ATAAGGCTGA GGTTGGTCAT GGTTACATGG 1260
ACCGCCCCTG CCTCAATCCG GCCGATCCAG ACTGCCCCGC CACAGCCCCC AACAAAAATT 1320
CAACCAAACC TCTTGATATG GCCCTTGTTT TGAATGGTGG ATGTCATGGC TTATCCAGAA 1380
AGTATATGCA CTGGCAGGAG GAGTTGATTG TGGGTGGCAC AGTCAAGAAC AGCACTGGAA 1440
AACTCGTCAG CGCCCATGCC CTGCAGACCA TGTTCCAGTT AATGACTCCC AAGCAAATGT 1500
ACGAGCACTT CAAGGGGTAC GAGTATGTCT CACACATCAA CTGGAACGAG GACAAAGCGG 1560
CAGCCATCCT GGAGGCCTGG CAGAGGACAT ATGTGGAGGT GGTTCATCAG AGTGTCGCAC 1620
AGAACTCCAC TCAAAAGGTG CTTTCCTTCA CCACCACGAC CCTGGACGAC ATCCTGAAAT 1680
CCTTCTCTGA CGTCAGTGTC ATCCGCGTGG CCAGCGGCTA CTTACTCATG CTCGCCTATG 1740
CCTGTCTAAC CATGCTGCGC TGGGACTGCT CCAAGTCCCA GGGTGCCGTG GGGCTGGCTG 1800
oJJ I'JCTGCT GGTTGCACTG TCAGTGGCTG CAGGACTGGG CCTC'J jCTC'A TTGATCGGAA 1860

TTTCCTTTAA CGCTGCAACA ACTCAGGTTT TGCCATTTCT CGCTCTTGGT GTTGGTGTGG 1920

ATGATGTTTT TCTTCTGGCC CACGCCTTCA GTGAAACAGG ACAGAATAAA AGAATCCCTT 1980

TTGAGGACAG GACCGGGGAG TGCCTGAAGC GCACAGGAGC CAGCGTGGCC CTCACGTCCA 2040

TCAGCAATGT CACAGCCTTC TTCATGGCCG CGTTAATCCC AATTCCCGCT CTGCGGGCGT 2100

TCTCCCTCCA GGCAGCGGTA GTAGTGGTGT TCAATTTTGC CATGGTTCTG CTCATTTTTC 2160

"ToOAATTCT CAGCATGGAT TTATATCGAC GCGAGGACAG GAGATTGGAT ATTTTCTGCT 2220

GTTTTACAAG CCCCTGCGTC AGCAGAGTGA TTCAGGTTGA ACCTCAGGCC TACACCGACA 2280

CACACGACAA TACCCGCTAC AGCCCCCCAC CTCCCTACAG CAGCCACAGC TTTGCCCATG 2340

AAACGCAGAT TACCATGCAG TCCACTGTCC AGCTCCGCAC GGAGTACGAC CCCCACACGC 2400

ACGTGTACTA CACCACCGCT GAGCCGCGCT CCGAGATCTC TGTGCAGCCC GTCACCGTGA 2460

CACAGGACAC CCTCAGCTGC CAGAGCCCAG AGAGCACCAG CTCCACAAGG GACCTGCTCT 2520

CCCAGTTCTC CGACTCCAGC CTCCACTGCC TCGAGCCCCC CTGTACGAAG TGGACACTCT 2580

CATCTTTTGC TGAGAAGCAC TATGCTCCTT TCCTCTTGAA ACCAAAAGCC AAGGTAGTGG 2640

TGATCTTCCT TTTTCTGGGC TTGCTGGGGG TCAGCCTTTA TGGCACCACC CGAGTGAGAG 2700

ACGGGCTGGA CCTTACGGAC ATTGTACCTC GGGAAACCAG AGAATATGAC TTTATTGCTG 2760

CACAATTCAA ATACTTTTCT TTCTACAACA TGTATATAGT CACCCAGAAA GCAGACTACC 2820

CGAATATCCA GCACTTACTT TACGACCTAC ACAGGAGTTT CAGTAACGTG AAGTATGTCA 2880

TGTTGGAAGA AAACAAACAG CTTCCCAAAA TGTGGCTGCA CTACTTCAGA GACTGGCTTC 2940

h:Y. ; ;a?ttca GGATGCATTT GACAGTGACT GGGAAACCGG GAAAATCATG CCAAACAATT 3000

ACAAGAATGG ATCAGACGAT GGAGTCCTTG CCTACAAACT CCTGGTGCAA ACCGGCAGCC 3060
GCGATAAGCC CATCGACATC AGCCAGTTGA CTAAACAGCG TCTGGTGGAT GCAGATGGCA 3120
TCATTAATCC CAGCGCTTTC TACATCTACC TGACGGCTTG GGTCAGCAAC GACCCCGTCG 3180
CGTATGCTGC CTCCCAGGCC AACATCCGGC CACACCGACC AGAATGGGTC CACGACAAAG 3240
CCGACTACAT GCCTGAAACA AGGCTGAGAA TCCCGGCAGC AGAGCCCATC GAGTATGCCC 3300
AGTTCCCTTT CTACCTCAAC GGGTTGCGGG ACACCTCAGA CTTTGTGGAG GCAATTGAAA 3360
AAGTAAGGAC CATCTGCAGC AACTATACGA GCCTGGGGCT GTCCAGTTAC CCCAACGGCT 3420
ACCCCTTCCT CTTCTGGGAG CAGTACATCG GCCTCCGCCA CTGGCTGCTG CTGTTCATCA 3480
GCGTGGTGTT GGCCTGCACA TTCCTCGTGT GCGCTGTCTT CCTTCTGAAC CCCTGGACGG 3540
CCGGGATCAT TGTGATGGTC CTGGCGCTGA TGACGGTCGA GCTGTTCGGC ATGATGGGCC 3600
TCATCGGAAT CAAGCTCAGT GCCGTGCCCG TGGTCATCCT GATCGCTTCT GTTGGCATAG 3660
GAGTGGAGTT CACCGTTCAC GTTGCTTTGG CCTTTCTGAC GGCCATCGGC GACAAGAACC 3720
GCAGGGCTGT GCTTGCCCTG GAGCACATGT TTGCACCCGT CCTGGATGGC GCCGTGTCCA 3780
CTCTGCTGGG AGTGCTGATG CTGGCGGGAT CTGAGTTCGA CTTCATTGTC AGGTATTTCT 3840
TTGCTGTGCT GGCGATCCTC ACCATCCTCG GCGTTCTCAA TGGGCTGGTT TTGCTTCCCG 3900
TGCTTTTGTC TTTCTTTGGA CCATATCCTG AGGTGTCTCC AGCCAACGGC TTGAACCGCC 3960
TGCCCACACC CTCCCCTGAG CCACCCCCCA GCGTGGTCCG CTTCGCCATG CCGCCCGGCC 4020

ACACGCACAG CGGGTCTGAT TCCTCCGACT CGGAGTATAG TTCCCAGACG ACAGTGTCAG 4080 

GCCTCAGCGA GGAGCTTCGG CACTACGAGG CCCAGCAGGG CGCGGGAGGC CCTGCCCACC 4140 

AAGTGATCGT GGAAGCCACA GAAAACCCCG TCTTCGCCCA CTCCACTGTG GTCCATCCCC 4200

AATCCAGGCA TCACCCACCC TCGAACCCGA GACAGCAGCC CCACCTGGAC TCAGGGTCCC 4260

TGCCTCCCGG ACGGCAAGGC CAGCAGCCCC GCAGGGACCC CCCCAGAGAA GGCTTGTGGC 4320

CACCCCTCTA CAGACCGCGC AGAGACGCTT TTGAAATTTC TACTGAAGGG CATTCTGGCC 4380

CTAGCAATAG GGCCCGCTGG GGCCCTCGCG GGGCCCGTTC TCACAACCCT CGGAACCCAG 4440

CGTCCACTGC CATGGGCAGC TCCGTGCCCG GCTACTGCCA GCCCATCACC ACTGTGACGG 4500

CTTCTGCCTC CGTGACTGTC GCCGTGCACC CGCCGCCTGT CCCTGGGCCT GGGCGGAACC 4560

CCCGAGGGGG ACTCTGCCCA GGCTACCCTG AGACTGACCA CGGCCTGTTT GAGGACCCCC 4620

ACGTGCCTTT CCACGTCCGG TGTGAGAGCTA GGGATTCGAA GGTGGAAGTC ATTGAGCTGC 4680

AGGACGTGGA ATGCGAGGAG AGGCCCCGGG GAAGCAGCTC CAACTGAGGG TGATTAAAAT 4740

CTGAAGCAAA GAGGCCAAAG ATTGGAAACC CCCCACCCCC ACCTCTTTCC AGAACTGCTT 4800

GAAGAGAACT GGTTGGAGTT ATGGAAAAGA TGCCCTGTGC CAGGACAGCA GTTCATTGTT 4860

ACTGTAACCG ATTGTATTAT TTTGTTAAAT ATTTCTATAA ATATTTAAGA GATGTACACA 4920



WO 97/45541 PCT/US97/09553

TGTGTAATAT AGGAAGGAAG GATGTAAAGT GGTATGATCT GGGGCTTCTC CACTCCTGCC 4980

CCAGAGTGTG GAGGCCACAG TGGGGCCTCT CCGTATTTGT GCATTGGGCT CCGTGCCACA 5040

ACCAAGCTTC ATTAGTCTTA AATTTCAGCA TATGTTGCTG CTGCTTAAAT ATTGTATAAT 5100 

TTACTTGTAT AATTCTATGC AAATATTGCT TATGTAATAG GATTATTTTG TAAAGGTTTC 5160 

TGTTTAAAAT ATTTTAAATT TGCATATCAC AACCCTGTGG TAGTATGAAA TGTTACTGTT 5220 

AACTTTCAAA CACGCTATGC GTGATAATTT TTTTGTTTAA TGAGCAGATA TGAAGAAAGC 5280 
CCGGAATT 5288

(2) INFORMATION FOR SEQ ID NO: 19:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 1447 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 19:

Met Ala Ser Ala Gly Asn Ala Ala Glu Pro Gln Asp Arg Gly Gly Gly
1 5 10 15

Gly Ser Gly Cys Ile Gly Ala Pro Gly Arg Pro Ala Gly Gly Gly Arg
20 25 30

Arg Arg Arg Thr Gly Gly Leu Arg Arg Ala Ala Ala Pro Asp Arg Asp
35 40 45

Tyr Leu His Arg Pro Ser Tyr Cys Asp Ala Ala Phe Ala Leu Glu Gln
50 55 60

Ile Ser Lys Gly Lys Ala Thr Gly Arg Lys Ala Pro Leu Trp Leu Arg
65 70 75 80

Ala Lys Phe Gln Arg Leu Leu Phe Lys Leu Gly Cys Tyr Ile Gln Lys
85 90 95

Asn Cys Gly Lys Phe Leu Val Val Gly Leu Leu Ile Phe Gly Ala Phe
100 105 110

Ala Val Gly Leu Lys Ala Ala Asn Leu Glu Thr Asn Val Glu Glu Leu
115 120 125

Trp Val Glu Val Gly Gly Arg Val Ser Arg Glu Leu Asn Tyr Thr Arg
130 135 140

Gln Lys Ile Gly Glu Glu Ala Met Phe Asn Pro Gln Leu Met Ile Gln
145 150 155 160

Thr Pro Lys Glu Glu Gly Ala Asn Val Leu Thr Thr Glu Ala Leu Leu
165 170 175

Gln His Leu Asp Ser Ala Leu Gln Ala Ser Arg Val His Val Tyr Met
180 185 190

Tyr Asn Arg Gln Trp Lys Leu Glu His Leu Cys Tyr Lys Ser Gly Glu
195 200 205

Leu Ile Thr Glu Thr Gly Tyr Met Asp Gln Ile Ile Glu Tyr Leu Tyr
210 215 220

Pro Cys Leu Ile Ile Thr Pro Leu Asp Cys Phe Trp Glu Gly Ala Lys
225 230 235 240

Leu Gln Ser Gly Thr Ala Tyr Leu Leu Gly Lys Pro Pro Leu Arg Trp
245 250 255

Thr Asn Phe Asp Pro Leu Glu Phe Leu Glu Glu Leu Lys Lys Ile Asn
260 265 270

Tyr Gln Val Asp Ser Trp Glu Glu Met Leu Asn Lys Ala Glu Val Gly
275 280 285

His Gly Tyr Met Asp Arg Pro Cys Leu Asn Pro Ala Asp Pro Asp Cys
290 295 300

Pro Ala Thr Ala Pro Asn Lys Asn Ser Thr Lys Pro Leu Asp Met Ala
305 310 315 320

Leu Val Leu Asn Gly Gly Cys His Gly Leu Ser Arg Lys Tyr Met His
325 330 335

Trp Gln Glu Glu Leu Ile Val Gly Gly Thr Val Lys Asn Ser Thr Gly
340 345 350

Lys Leu Val Ser Ala His Ala Leu Gln Thr Met Phe Gln Leu Met Thr
355 360 365

Pro Lys Gln Met Tyr Glu His Phe Lys Gly Tyr Glu Tyr Val Ser His
370 375 380

Ile Asn Trp Asn Glu Asp Lys Ala Ala Ala Ile Leu Glu Ala Trp Gln
385 390 395 400

Arg Thr Tyr Val Glu Val Val His Gln Ser Val Ala Gln Asn Ser Thr
405 410 415

Gln Lys Val Leu Ser Phe Thr Thr Thr Thr Leu Asp Asp Ile Leu Lys
420 425 430

Ser Phe Ser Asp Val Ser Val Ile Arg Val Ala Ser Gly Tyr Leu Leu
435 440 445

Met Leu Ala Tyr Ala Cys Leu Thr Met Leu Arg Trp Asp Cys Ser Lys
450 455 460

Ser Gln Gly Ala Val Gly Leu Ala Gly Val Leu Leu Val Ala Leu Ser
465 470 475 480

Val Ala Ala Gly Leu Gly Leu Cys Ser Leu Ile Gly Ile Ser Phe Asn
485 490 495

Ala Ala Thr Thr Gln Val Leu Pro Phe Leu Ala Leu Gly Val Gly Val
500 505 510

Asp Asp Val Phe Leu Leu Ala His Ala Phe Ser Glu Thr Gly Gln Asn
515 520 525

Lys Arg Ile Pro Phe Glu Asp Arg Thr Gly Glu Cys Leu Lys Arg Thr
530 535 540

Gly Ala Ser Val Ala Leu Thr Ser Ile Ser Asn Val Thr Ala Phe Phe
545 550 555 560

Met Ala Ala Leu Ile Pro Ile Pro Ala Leu Arg Ala Phe Ser Leu Gln
565 570 575

Ala Ala Val Val Val Val Phe Asn Phe Ala Met Val Leu Leu Ile Phe
580 585 590

Pro Ala Ile Leu Ser Met Asp Leu Tyr Arg Arg Glu Asp Arg Arg Leu
595 600 605

Asp Ile Phe Cys Cys Phe Thr Ser Pro Cys Val Ser Arg Val Ile Gln
610 615 620

Val Glu Pro Gln Ala Tyr Thr Asp Thr His Asp Asn Thr Arg Tyr Ser
625 630 635 640

Pro Pro Pro Pro Tyr Ser Ser His Ser Phe Ala His Glu Thr Gln Ile
645 650 655

Thr Met Gln Ser Thr Val Gln Leu Arg Thr Glu Tyr Asp Pro His Thr
660 665 670

His Val Tyr Tyr Thr Thr Ala Glu Pro Arg Ser Glu Ile Ser Val Gln
675 680 685

Pro Val Thr Val Thr Gln Asp Thr Leu Ser Cys Gln Ser Pro Glu Ser
690 695 700

Thr Ser Ser Thr Arg Asp Leu Leu Ser Gln Phe Ser Asp Ser Ser Leu
705 710 715 720

His Cys Leu Glu Pro Pro Cys Thr Lys Trp Thr Leu Ser Ser Phe Ala
725 730 735

Glu Lys His Tyr Ala Pro Phe Leu Leu Lys Pro Lys Ala Lys Val Val
740 745 750

Val Ile Phe Leu Phe Leu Gly Leu Leu Gly Val Ser Leu Tyr Gly Thr
755 760 765

Thr Arg Val Arg Asp Gly Leu Asp Leu Thr Asp Ile Val Pro Arg Glu
770 775 780

Thr Arg Glu Tyr Asp Phe Ile Ala Ala Gln Phe Lys Tyr Phe Ser Phe
785 790 795 800

Tyr Asn Met Tyr Ile Val Thr Gln Lys Ala Asp Tyr Pro Asn Ile Gln
805 810 815

His Leu Leu Tyr Asp Leu His Arg Ser Phe Ser Asn Val Lys Tyr Val
820 825 830

Met Leu Glu Glu Asn Lys Gln Leu Pro Lys Met Trp Leu His Tyr Phe
835 840 845

Arg Asp Trp Leu Gln Gly Leu Gln Asp Ala Phe Asp Ser Asp Trp Glu
850 855 860

Thr Gly Lys Ile Met Pro Asn Asn Tyr Lys Asn Gly Ser Asp Asp Gly
865 870 875 880

Val Leu Ala Tyr Lys Leu Leu Val Gln Thr Gly Ser Arg Asp Lys Pro
885 890 895

Ile Asp Ile Ser Gln Leu Thr Lys Gln Arg Leu Val Asp Ala Asp Gly
900 905 910

Ile Ile Asn Pro Ser Ala Phe Tyr Ile Tyr Leu Thr Ala Trp Val Ser
915 920 925

Asn Asp Pro Val Ala Tyr Ala Ala Ser Gln Ala Asn Ile Arg Pro His
930 935 940

Arg Pro Glu Trp Val His Asp Lys Ala Asp Tyr Met Pro Glu Thr Arg
945 950 955 960

Leu Arg Ile Pro Ala Ala Glu Pro Ile Glu Tyr Ala Gln Phe Pro Phe
965 970 975

Tyr Leu Asn Gly Leu Arg Asp Thr Ser Asp Phe Val Glu Ala Ile Glu
980 985 990

Lys Val Arg Thr Ile Cys Ser Asn Tyr Thr Ser Leu Gly Leu Ser Ser
995 1000 1005

Tyr Pro Asn Gly Tyr Pro Phe Leu Phe Trp Glu Gln Tyr Ile Gly Leu
1010 1015 1020

Arg His Trp Leu Leu Leu Phe Ile Ser Val Val Leu Ala Cys Thr Phe
1025 1030 1035 1040

Leu Val Cys Ala Val Phe Leu Leu Asn Pro Trp Thr Ala Gly Ile Ile
1045 1050 1055

Val Met Val Leu Ala Leu Met Thr Val Glu Leu Phe Gly Met Met Gly
1060 1065 1070

Leu Ile Gly Ile Lys Leu Ser Ala Val Pro Val Val Ile Leu Ile Ala
1075 1080 1085

Ser Val Gly Ile Gly Val Glu Phe Thr Val His Val Ala Leu Ala Phe
1090 1095 1100

Leu Thr Ala Ile Gly Asp Lys Asn Arg Arg Ala Val Leu Ala Leu Glu
1105 1110 1115 1120

His Met Phe Ala Pro Val Leu Asp Gly Ala Val Ser Thr Leu Leu Gly
1125 1130 1135

Val Leu Met Leu Ala Gly Ser Glu Phe Asp Phe Ile Val Arg Tyr Phe
1140 1145 1150

Phe Ala Val Leu Ala Ile Leu Thr Ile Leu Gly Val Leu Asn Gly Leu
1155 1160 1165

Val Leu Leu Pro Val Leu Leu Ser Phe Phe Gly Pro Tyr Pro Glu Val
1170 1175 1180

Ser Pro Ala Asn Gly Leu Asn Arg Leu Pro Thr Pro Ser Pro Glu Pro
1185 1190 1195 1200
Pro Pro Ser Val Val Arg Phe Ala Met Pro Pro Gly His Thr His Ser
1205 1210 1215

Gly Ser Asp Ser Ser Asp Ser Glu Tyr Ser Ser Gln Thr Thr Val Ser
1220 1225 1230

Gly Leu Ser Glu Glu Leu Arg His Tyr Glu Ala Gln Gln Gly Ala Gly
1235 1240 1245

Gly Pro Ala His Gln Val Ile Val Glu Ala Thr Glu Asn Pro Val Phe
1250 1255 1260

Ala His Ser Thr Val Val His Pro Glu Ser Arg His His Pro Pro Ser
1265 1270 1275 1280

Asn Pro Arg Gln Gln Pro His Leu Asp Ser Gly Ser Leu Pro Pro Gly
1285 1290 1295

Arg Gln Gly Gln Gln Pro Arg Arg Asp Pro Pro Arg Glu Gly Leu Trp
1300 1305 1310

Pro Pro Leu Tyr Arg Pro Arg Arg Asp Ala Phe Glu Ile Ser Thr Glu
1315 1320 1325

Gly His Ser Gly Pro Ser Asn Arg Ala Arg Trp Gly Pro Arg Gly Ala
1330 1335 1340

Arg Ser His Asn Pro Arg Asn Pro Ala Ser Thr Ala Met Gly Ser Ser
1345 1350 1355 1360

Val Pro Gly Tyr Cys Gln Pro Ile Thr Thr Val Thr Ala Ser Ala Ser
1365 1370 1375

Val Thr Val Ala Val His Pro Pro Pro Val Pro Gly Pro Gly Arg Asn
1380 1385 1390

Pro Arg Gly Gly Leu Cys Pro Gly Tyr Pro Glu Thr Asp His Gly Leu
1395 1400 1405

Phe Glu Asp Pro His Val Pro Phe His Val Arg Cys Glu Arg Arg Asp
1410 1415 1420

Ser Lys Val Glu Val Ile Glu Leu Gln Asp Val Glu Cys Glu Glu Arg
1425 1430 1435 1440

Pro Arg Gly Ser Ser Ser Asn
1445

WHAT IS CLAIMED IS:

1 *£ """^ a " afcte/ P^tein other than Drosophila melanogaster 

ES£^ i ^ rf "'-*« ^n.in.e.gthther^^otheXr 

^S^^ » C,aim 1 said^protein is mosquito, 

An^^c add according to Claim ,. wherein said patched protein is a 

An isolated nucleic arM 

^ „ v, lo ,„, j, wncrcm sua patched protein is human 

An isolated nucleic acid according to Claim 3, wherein said patched protein is mouse.

6. An expression cassette comprising a transcriptional initiation region functional in an
expression host, a nucleic acid having a sequence of the isolated nucleic acid according
to Claim 1 under the transcriptional regulation of said transcriptional initiation region, and
a transcriptional termination region functional in said expression host.

7. A cell comprising an expression cassette according to Claim 6 as part of an
extrachromosomal element or integrated into the genome of a host cell as a result of
introduction of said expression cassette into said host cell, and the cellular progeny of
said host cell.

25 8 t^^^F-^T* P3tched protein - ««" meth °d comprising growing a cell 
9. A purified polypeptide composition comprising at least 50 weicht % of the n m t B ;« 

11. A purified polypeptide composition according to Claim 10, wherein said patched protein
35 i^ P ° , ^ fl "^ 

— pmenta, 

" gS^SvSu^ PrediSP0Si " g mUtati ° n in * *• 

" I"**""* mutation indicate to said individual 

has a genetic predion for at least one of developmental abnorniS Td 



15. A method according to Claim 14, wherein said genetic predisposition is basal cell nevus
syndrome.

16. A method according to Claim 14, wherein said detecting step comprises analyzing the ...

17. A method according to Claim 14, wherein said detecting step comprises functional
analysis of patched protein function.

18. A method according to Claim 14, wherein said detecting step comprises detecting
antibody binding to abnormal patched protein.

19. A method for characterizing the phenotype of a tumor, the method comprising:

detecting the presence of an oncogenic patched mutation in said tumor, wherein
the presence of said oncogenic mutation indicates that said tumor has a
patched-associated phenotype.

20. A method according to Claim 19, wherein said tumor is a carcinoma.

21. A method according to Claim 20, wherein said carcinoma is a basal cell carcinoma.

22. A method according to Claim 19, wherein said detecting step comprises analyzing the ...

23. A method according to Claim 19, wherein said detecting step comprises functional
analysis of patched protein function.

24. A method according to Claim 19, wherein said detecting step comprises detecting
antibody binding to abnormal patched protein.

25. A genetically engineered mammalian cell predisposed to develop basal cell carcinoma as
a result of transfection of said mammalian cell with at least one DNA construct
comprising an altered patched or hedgehog gene.